I. INTRODUCTION


The research described in this report is concerned with the transient behavior of
linearly constrained wideband adaptive array sensors. The intent of this analysis is to
develop a computationally inexpensive constrained sensor which is capable of quick
convergence in a dynamic signal environment. Various structures will be examined and
their transient performance will be evaluated through simulation.

The use of multichannel space-time processors has a proven value in the detection
and estimation of signals which are received at spatially separated sensors. The benefit of
utilizing such an array of receiving sensor elements to improve signal reception has long
been recognized in the fields of communications [1], radar [2], sonar [3,4] and seismology
[5].

In a dynamic signal environment it is desirable to have the processor sense the
presence of interference noise sources and automatically adapt itself in order to both suppress
the interference and enhance the desired signal reception. These dual functions are realized
through the use of an adaptive control system which updates the parameters of an array
processor in order to minimize some performance index. This process is depicted in figure 1.
The goal of the entire system is to produce an output which is the best estimate, in some
statistical sense, of a particular waveform of the composite observation data. This must be
accomplished with little or no a priori knowledge of the signal environment. The first
adaptive array research can be traced to the late 1950's and early 1960's [6,7,8,9]. Much
research has been done on adaptive sensors since these early developments, and adaptive
arrays have been incorporated into many modern signal processing systems.

Figure 1 Adaptive Array Sensor

Thus,  the  problem  at  hand  is  characterized  by  a  need  to  optimize  the  reception  of 
one  or  more  desired  signals  when  multiple  desired  and  undesired  wideband  directional 
sources  impinge  upon  an  array  of  passive  receiving  sensor  elements.  The  traditional  solution 
to  this  problem  is  the  utilization  of  an  adaptive  tapped-delay-line  filter  following  each 
element of the array. The rationale for this solution is based upon the stability properties of
the  finite  impulse  response  class  of  filters  coupled  with  the  ability  of  the  tapped-delay-line 
filter  to  process  signals  which  encompass  an  appreciable  bandwidth.  The  choice  of  adaptive 
algorithm  for  updating  the  coefficients  of  the  adaptive  filters  in  this  study  is  restricted  to  the 
computationally  modest  stochastic  gradient  class. 

The difficulty encountered with the traditional method of processing described above
is that a dependency exists between the speed of convergence of the adaptive processor and
the range of the eigenvalues of the observation data correlation matrix. The methodology
which this research follows is to utilize different adaptive structures in an attempt to
transform the observation data into a domain such that the resulting transformed correlation
matrix exhibits a smaller eigenvalue spread or the aforementioned dependency is relieved.

A. Organization of the Report

This  section  will  introduce  the  organization  of  the  research  to  be  presented.  Section 
I.B  reviews  estimation  theory  and  derives  the  discrete-time  Wiener  filter.  In  Section  I.C, 
we  extend  the  previous  results  to  the  multichannel  wideband  array  case.  Section  I.D  then 
derives  and  analyzes  the  classical  least-mean  square  (LMS)  algorithm. 

One shortcoming of the early LMS adaptive array systems was the degradation of
the desired signal while attempting to minimize interference in the receiving sensor
sidelobes. Through the imposition of hard constraints on certain aspects of the processor
we can guarantee some desirable responses regardless of the external environment.

The constraint of interest in this research is realized by defining the frequency
response of the processor in the direction of the desired signal. Through the enforcement
of this frequency response in the desired signal direction, one guarantees that the adaptation
process cannot cause its degradation.

The second chapter derives three linearly constrained wideband adaptive array
sensors with tapped-delay-line structure. The first of these, termed the direct form, was
originally conceived by Frost [11]. The second form is a partitioned realization which is
shown to be identical to the direct form. This form, introduced by Griffiths [15], is derived
solely to facilitate the development of the third form. The final form, termed the Generalized
Sidelobe Canceller (GSC), was first presented by Applebaum and Chapman [12], and later
extended by Griffiths [13]. The GSC form is then used extensively in this research.

The third chapter is concerned with replacing the GSC tapped-delay-line filter
structure with orthogonal transform domain filter structures. The Discrete Fourier
Transform (DFT), Discrete Cosine Transform (DCT), lattice and Gram-Schmidt structures
are examined, and a new frequency subband normalization algorithm is introduced.
Simulation results are presented to compare the different structures.

The  fourth  chapter  provides  an  overall  comparison  of  the  different  structures 
considered,  examines  the  results  of  the  last  chapter,  presents  the  conclusions  of  this  research 
and  identifies  areas  for  further  research. 

B. Estimation Theory and Wiener Filtering

This section considers discrete time estimation theory and derives the scalar form of
the Wiener filter. The derivation in this section follows Gelb [32] and Widrow [20]. For a
more thorough treatment and derivation in continuous time, one is referred to Van Trees
[16]. In simple terms, estimation is concerned with the use of information derived from
observations in order to make decisions about parameters of interest that are optimal in some
sense. The adaptive sensor problem is to estimate a signal of interest s(k), which is observed
only in the presence of additive noise n(k). That is to say, given a received observation
signal sequence x(k) such that

$$ x(k) = s(k) + n(k), \qquad k = 1, 2, \ldots \tag{1-1} $$

we desire to process it in order to obtain an estimate $y(k) = \hat{s}(k)$. The observation data x(k)
is a random variable whose statistics are formed from those of both the signal and the noise.
The processing or filtering is shown in figure 2. In section A of this chapter we mentioned
that we were interested in an array processor which had a finite impulse response. The FIR
filter is represented by its impulse response denoted h(k) on the box in figure 2.

Figure 3 Tapped-Delay-Line FIR Filter

Consider the linear tapped-delay-line (TDL) filter shown in figure 3. This filter
clearly has a finite duration impulse response characterized by the sequence

$$ h(k) = W \tag{1-2} $$

where $W$ is a J-dimensional vector and the filter is therefore termed a FIR filter of order J.
The notation $z^{-1}$ in the boxes of figure 2 represents a unit time delay. We assume that the
filter is driven by the random process x(k) and that this process is wide-sense stationary; it
is characterized by a mean value which is independent of time

$$ E[x(k)] = a \tag{1-3} $$

where a is a constant (assumed zero for simplicity) and a correlation function

$$ E[x(m)x(n)] \triangleq r_{xx}(m,n) = r_{xx}(m-n) = r_{xx}(\tau) \tag{1-4} $$

with $\tau$ being an integer value. The filter output may be expressed as the convolution sum

$$ y(k) = \sum_{m=1}^{J} w_m x(k-m+1) \tag{1-5} $$


We now assume for the following derivation that there is some desired reference
signal, d(k), available which represents the desired system output. Then the residual or error
signal e(k) is defined as

$$ e(k) = d(k) - y(k) \tag{1-6} $$

We desire to find the optimal values of the filter coefficients W, which minimize this error
signal in some statistical sense. In Wiener filter theory [19] the performance function that
is used to optimize the filter coefficients is the mean-square value of the error signal.

The mean-square value of equation (1-6) is represented as

$$ E[e^2(k)] = E[d^2(k)] - 2E[d(k)y(k)] + E[y^2(k)] \tag{1-7} $$

and through the substitution of equation (1-5) into equation (1-7) we may write

$$ E[e^2(k)] = E[d^2(k)] - 2\sum_{m=1}^{J} w_m E[d(k)x(k-m+1)] + \sum_{m=1}^{J}\sum_{n=1}^{J} w_m w_n E[x(k-m+1)x(k-n+1)] \tag{1-8} $$

We now assume that the medium through which the signal propagates to reach the sensor
is linear and time-invariant. Then the desired signal component s(k) of the observation data
x(k) is related through a linear time-invariant transformation to the desired reference signal
d(k). Furthermore, we then assume that x(k) and d(k) are jointly stationary. This means
that the expression in equation (1-8) is the sum of the mean-squared value of the desired
response, a function of the cross-correlation of the desired response with the observation
signal and a function of the correlation of the received observation signal. Thus, equation
(1-8) may be written as

$$ E[e^2(k)] = E[d^2(k)] - 2\sum_{m=1}^{J} w_m r_{xd}(m-1) + \sum_{m=1}^{J}\sum_{n=1}^{J} w_m w_n r_{xx}(n-m) \tag{1-9} $$

where the cross-correlation and input correlation are, respectively

$$ r_{xd}(m-1) \triangleq E[d(k)x(k-m+1)] \tag{1-10} $$

$$ r_{xx}(n-m) \triangleq E[x(k-m+1)x(k-n+1)] \tag{1-11} $$
It is convenient to now change notation from scalar representation to matrix form, and define
the observation vector and weight vector as

$$ X(k) = \begin{bmatrix} x(k) \\ x(k-1) \\ \vdots \\ x(k-J+1) \end{bmatrix}, \qquad W(k) = \begin{bmatrix} w_1(k) \\ w_2(k) \\ \vdots \\ w_J(k) \end{bmatrix} \tag{1-12} $$

where we have explicitly shown the time-dependency of the filter coefficients. Suppressing
the matrix time-varying notation and recognizing from equation (1-5) that the output may
be expressed as $y = W^T X$, equation (1-9) is equivalent to

$$ E[e^2(k)] = E[d^2(k)] - 2R_{xd}^T W + W^T R_{xx} W \tag{1-13} $$
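
The step from the quadratic form (1-13) to the normal equations can be sketched as follows. This is a brief worked derivation under the assumptions that $R_{xx}$ is symmetric and that $W$ is unconstrained; the symbol $\nabla_W$ is notation assumed here for the derivative of the mean-square error with respect to $W$.

```latex
\begin{aligned}
\nabla_W E[e^2(k)]
  &= \frac{\partial}{\partial W}\left( E[d^2(k)] - 2 R_{xd}^T W + W^T R_{xx} W \right) \\
  &= -2 R_{xd} + 2 R_{xx} W , \\
\nabla_W E[e^2(k)] = 0 \;&\Longrightarrow\; R_{xx} W = R_{xd} .
\end{aligned}
```

When $R_{xx}$ is positive definite the quadratic surface is a bowl with a single minimum, so this stationary point is the minimizing weight vector.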
$$ R_{xx} W = R_{xd} \tag{1-17} $$

and we find the optimal weight vector is

$$ W_{opt} = R_{xx}^{-1} R_{xd} \tag{1-18} $$

which is the Wiener-Hopf equation in matrix form. Examining equation (1-17) in scalar
form, the optimal weight values are given by the solution to the equation

$$ \sum_{m=1}^{J} w_{opt,m}\, r_{xx}(m-n) = r_{xd}(n-1), \qquad n = 1, 2, \ldots, J \tag{1-19} $$


A tapped-delay-line filter whose impulse response is defined by equation (1-18) or (1-19)
is said to be optimal in a mean-square sense. The filter output realized by $W_{opt}$ is denoted
$y_{opt}(k)$ and is the best estimate (in a mean-square sense) of the desired response given the
observation input. This may be expressed as

$$ y_{opt}(k) = W_{opt}^T X(k) = \sum_{m=1}^{J} w_{opt,m}\, x(k-m+1) \tag{1-20} $$

Therefore, using (1-10) and (1-11), equation (1-19) can be written as

$$ \sum_{m=1}^{J} w_{opt,m}\, E[x(k-m+1)x(k-n+1)] = E[d(k)x(k-n+1)], \qquad n = 1, 2, \ldots, J \tag{1-21} $$

This is equivalent to the statement

$$ E\Big[\Big(d(k) - \sum_{m=1}^{J} w_{opt,m}\, x(k-m+1)\Big)x(k-n+1)\Big] = 0, \qquad n = 1, 2, \ldots, J \tag{1-22} $$

and finally, using (1-20)

$$ E[(d(k) - y_{opt}(k))x(k-n+1)] = E[e_{opt}(k)x(k-n+1)] = 0, \qquad n = 1, 2, \ldots, J \tag{1-23} $$

where $e_{opt}$ is defined as

$$ e_{opt}(k) = d(k) - y_{opt}(k) \tag{1-24} $$

Figure 4 Orthogonality Condition

Two important results now appear from this derivation. First, equation (1-23) states that the
error signal and any of the observation signals are orthogonal in the optimal filter. Second,
the error signal $e_{opt}$ and the output $y_{opt}$ are orthogonal since

$$ E[e_{opt}(k)y_{opt}(k)] = E\Big[e_{opt}(k)\sum_{m=1}^{J} w_{opt,m}\, x(k-m+1)\Big] = \sum_{m=1}^{J} w_{opt,m}\, E[e_{opt}(k)x(k-m+1)] = 0 \tag{1-25} $$

This condition, depicted in figure 4, ensures that the error signal vector is minimum.
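
The normal equations and the two orthogonality results can be checked numerically. The following is a minimal sketch, not the report's simulation: it assumes a two-tap filter, a sinusoid with `N = 16` samples per cycle, additive Gaussian noise of power 0.01, and sample averages over `NUM` points standing in for the expectation operator. All names and values are illustrative.

```python
import math
import random

N = 16            # samples per cycle of the sinusoid (assumed value)
NUM = 2000        # number of samples used for the time averages (assumed)
rng = random.Random(0)

# Observation x(k) = s(k) + n(k) and desired response d(k).
x = [math.sin(2 * math.pi * k / N) + rng.gauss(0.0, 0.1) for k in range(NUM + 1)]
d = [2 * math.cos(2 * math.pi * k / N) for k in range(NUM + 1)]

def mean(values):
    """Sample average, used here in place of E[.]."""
    return sum(values) / len(values)

ks = range(1, NUM + 1)                        # skip k = 0 so that x(k-1) exists
taps = [lambda k: x[k], lambda k: x[k - 1]]   # x(k-m+1) for m = 1, 2

# Sample estimates of r_xx(n-m) and r_xd(m-1).
Rxx = [[mean([tm(k) * tn(k) for k in ks]) for tn in taps] for tm in taps]
Rxd = [mean([d[k] * tm(k) for k in ks]) for tm in taps]

# Solve the 2 x 2 normal equations Rxx W = Rxd by Cramer's rule.
det = Rxx[0][0] * Rxx[1][1] - Rxx[0][1] * Rxx[1][0]
w_opt = [
    (Rxd[0] * Rxx[1][1] - Rxx[0][1] * Rxd[1]) / det,
    (Rxx[0][0] * Rxd[1] - Rxx[1][0] * Rxd[0]) / det,
]

y_opt = {k: w_opt[0] * x[k] + w_opt[1] * x[k - 1] for k in ks}
e_opt = {k: d[k] - y_opt[k] for k in ks}

# Orthogonality: error against each tap signal, and error against the output.
err_dot_taps = [mean([e_opt[k] * t(k) for k in ks]) for t in taps]
err_dot_output = mean([e_opt[k] * y_opt[k] for k in ks])
print(w_opt, err_dot_taps, err_dot_output)
```

Because the weights solve the sample normal equations exactly, both averages vanish to within round-off. With true expectations the same statement holds for the Wiener solution.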


We now extend the results of the previous section to the multichannel case of interest.
The wideband multichannel model is depicted in figure 5 for an array composed of K sensor
elements and J taps per element. It was mentioned in section I.B that signals whose spectrum
cannot be adequately characterized by a single frequency must be processed by a filter which
is capable of realizing a broadband frequency response. If the TDL filter tap spacing is
sufficiently close and the number of taps is large, then the filter will approximate an ideal
filter which exhibits exact control of gain and phase at each frequency of interest. The
sampling theorem [49] may be used to define the filter bandwidth. Consider a continuous
input signal which is sampled by one TDL filter. The sample sequence defined by the signals
appearing at the TDL taps uniquely characterizes the corresponding waveform from which
it was generated provided that the continuous signal is bandlimited with its highest frequency
component $f_{max}$ less than or equal to one-half the sample frequency corresponding to the
time delay $\Delta$, or $f_{max} \le 1/(2\Delta)$. The total bandwidth of a bandlimited signal is $2f_{max}$, so that
a TDL can uniquely characterize any continuous signal having a bandwidth less than or
equal to $1/\Delta$ Hz, the signal bandwidth of the TDL filter.

Figure 5 Wideband Multichannel Array
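
As a worked illustration of the bandwidth relation, take an assumed tap delay of $\Delta = 0.5$ ms. This value is chosen only to make the arithmetic concrete and does not come from the report.

```latex
\Delta = 0.5\ \text{ms}: \qquad
f_{max} \le \frac{1}{2\Delta} = \frac{1}{2 \times 0.5 \times 10^{-3}\ \text{s}} = 1\ \text{kHz},
\qquad
2 f_{max} \le \frac{1}{\Delta} = 2\ \text{kHz}.
```

Halving the tap delay therefore doubles the signal bandwidth that one tapped delay line can represent.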

The received observation data for the array in figure 5 is the sum of the directional
signals impinging upon the array and the thermal noise present on each element. The signals
are assumed to have been produced by sources in the far field which propagate through the
medium surrounding the sensor. We now define the KJ-dimensional vectors X(k) and
W(k) as

$$ X(k) = \begin{bmatrix} X_0(k) \\ X_1(k-\Delta) \\ \vdots \\ X_{J-1}(k-(J-1)\Delta) \end{bmatrix}, \qquad W(k) = \begin{bmatrix} w_1(k) \\ w_2(k) \\ \vdots \\ w_{KJ}(k) \end{bmatrix} \tag{1-26} $$

where $X_l(k-l\Delta)$ for $l = 0, 1, \ldots, J-1$ is the K-dimensional vector of the observation signals
present at the column of weights following the l-th delay. The signals are assumed to be
plane waves, so that if we let u represent the directional signal unit vector, r be the sensor
coordinate vector and v denote the propagation velocity, then the intersensor delay may be
written as

$$ \tau = \frac{u \cdot r}{v} \tag{1-27} $$

where the $(\cdot)$ operator is the standard vector inner product. Assuming that the reference
element is the top sensor of figure 5, then we may express the K dimensional vector of
directional signals $S_l(k)$ as


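$$ S_l(k) = \begin{bmatrix} s_1(k) \\ s_2(k) \\ \vdots \\ s_K(k) \end{bmatrix} = \begin{bmatrix} s_1(k) \\ s_1(k-\tau_1) \\ \vdots \\ s_1(k-\tau_{K-1}) \end{bmatrix} \tag{1-28} $$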

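For the K-dimensional vector of array observation signals $X_l(k)$ the correlation
matrix $\mathcal{R}_{xx}(\tau)$ is of dimension K x K and is given by

$$ \mathcal{R}_{xx}(\tau) = E[X_l(k) X_l^T(k-\tau)] \tag{1-29} $$

For the KJ-dimensional vector of all observation signals X(k) the correlation matrix is of
dimension KJ x KJ, and is given by

$$ R_{xx}(\tau) = E[X(k) X^T(k-\tau)] \tag{1-30} $$

which may be written as

$$ R_{xx}(\tau) = \begin{bmatrix}
\mathcal{R}_{xx}(\tau) & \mathcal{R}_{xx}(\tau+\Delta) & \cdots & \mathcal{R}_{xx}(\tau+(J-1)\Delta) \\
\mathcal{R}_{xx}(\tau-\Delta) & \mathcal{R}_{xx}(\tau) & \cdots & \vdots \\
\vdots & & \ddots & \vdots \\
\mathcal{R}_{xx}(\tau-(J-1)\Delta) & \cdots & \cdots & \mathcal{R}_{xx}(\tau)
\end{bmatrix} \tag{1-31} $$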

We assume that the signal and noise components of the observation data are independent.
Then

$$ R_{xx}(\tau) = R_{ss}(\tau) + R_{nn}(\tau) \tag{1-32} $$

where $R_{ss}(\tau)$ and $R_{nn}(\tau)$ are the KJ x KJ dimensional correlation matrices of the received
desired signal component and noise component, respectively, of the observation data:

$$ R_{ss}(\tau) = E[S(k) S^T(k-\tau)] \tag{1-33} $$

$$ R_{nn}(\tau) = E[N(k) N^T(k-\tau)] \tag{1-34} $$

At a zero time shift, it is well known [11,20,33] that for the case of interest $R_{xx}$ and $R_{nn}$ are
positive definite matrices and $R_{ss}$ is generally at least positive semi-definite. For the
remainder of this research, any second moment not explicitly containing a time delay
argument will be meant to denote the second moment at zero time delay.

The optimal weight vector in the minimum mean-square error sense for the wideband
multichannel array of figure 5 is given by the Wiener-Hopf equation

$$ W_{opt} = R_{xx}^{-1} R_{xd} \tag{1-35} $$

where $R_{xx}$ is now of dimension KJ x KJ as shown in equation (1-31) and $R_{xd}$ is a KJ x 1
vector formed in a manner analogous to equation (1-14), but using the vector X(k) given in
equation (1-26). This is the same solution as that of Widrow [20, equation 2.17].


The requirement which exists for the array processor is to solve the Wiener-Hopf
equation for the optimal weight vector. This solution requires knowledge of both $R_{xx}$ and
$R_{xd}$. In the problems of interest, the correlation matrix $R_{xx}$ is unknown while, in general,
the cross-correlation vector $R_{xd}$ may not be available. One method of obtaining the solution
would be the direct estimation of these values followed by their substitution into the
Wiener-Hopf equation. Monzingo and Miller [34] describe the drawbacks of this approach.
In summary: potentially serious computational problems arise in computing and inverting
$R_{xx}$; the number of measurements and computations needed to accurately estimate the
elements of $R_{xx}$ and $R_{xd}$ is large and requires repetition upon change of the input signal
statistics; and the implementation of a direct solution requires highly accurate estimates and
results in an open loop control.

Another method of solving the Wiener-Hopf equation is to solve for the optimal
weight vector iteratively through a gradient search procedure. The LMS algorithm [23] is
one of the family of gradient search techniques for descending towards the performance
surface minimum. Not having a priori knowledge of $W_{opt}$, we begin at some arbitrary
weight value W(0) and estimate the gradient at this point. We choose the next weight value
to be the value of the current weight plus an increment proportional to the negative slope
estimated. This leads to the iterative procedure

$$ W(k+1) = W(k) + \mu\left(-\hat{\nabla}_W(k)\right) \tag{1-36} $$

where $\hat{\nabla}_W(k)$ is the estimate of the gradient at time k and $\mu$ is the step size of the incremental
walk to the bottom of the performance bowl. It will soon be seen that $\mu$ also controls the
stability and the rate of convergence of the algorithm.

The key to the LMS algorithm is that it views $e^2(k)$ to be an estimate of the
mean-square error $E[e^2(k)]$. Thus, the gradient estimate is given by

$$ \hat{\nabla}_W(k) = \frac{\partial e^2(k)}{\partial W(k)} = 2e(k)\frac{\partial\left(d(k) - W^T(k)X(k)\right)}{\partial W(k)} = -2e(k)X(k) \tag{1-37} $$

yielding the LMS algorithm

$$ W(k+1) = W(k) + 2\mu e(k)X(k) \tag{1-38} $$
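
The update (1-38) translates almost line for line into code. The sketch below is one possible implementation for a single tapped delay line, not the simulation used in the report. The function name `lms`, the sinusoid period `N = 16` and the noise-free input are assumptions made for illustration.

```python
import math

def lms(x, d, num_taps, mu, w0=None):
    """Run W(k+1) = W(k) + 2*mu*e(k)*X(k) over the sequences x and d."""
    w = list(w0) if w0 is not None else [0.0] * num_taps
    weights, errors = [], []
    for k in range(num_taps - 1, len(x)):
        # Observation vector X(k) = [x(k), x(k-1), ..., x(k-J+1)]
        xk = [x[k - m] for m in range(num_taps)]
        y = sum(wm * xm for wm, xm in zip(w, xk))   # filter output y(k)
        e = d[k] - y                                # error e(k) = d(k) - y(k)
        w = [wm + 2.0 * mu * e * xm for wm, xm in zip(w, xk)]
        weights.append(list(w))
        errors.append(e)
    return weights, errors

N = 16            # samples per cycle (assumed; not fixed by the text)
mu = 0.05         # step size
num_iter = 500    # number of weight updates

x = [math.sin(2 * math.pi * k / N) for k in range(num_iter + 1)]
d = [2 * math.cos(2 * math.pi * k / N) for k in range(num_iter + 1)]
weights, errors = lms(x, d, num_taps=2, mu=mu)
print(weights[-1], errors[-1])
```

Each iteration costs on the order of J multiplications, which is why the stochastic gradient class is described as computationally modest. The price is the dependence of the convergence rate on the input statistics.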

The LMS algorithm is now demonstrated through a simple narrowband scalar
example motivated by Widrow [20]. The purpose of this example is to provide a graphical
understanding of the gradient search technique used in the LMS algorithm and to examine
the effects of noise in the development. Consider a tapped-delay-line filter with one delay
and two taps (order J=2). The input signal is a sinusoid given by $s(k) = \sin(2\pi k/N)$ and the
desired signal is $d(k) = 2\cos(2\pi k/N)$. The observation data x(k) was then formed in both the
case where n(k)=0 and for n(k) being a zero-mean gaussian random variable with power
$P_n = 0.01$. The correlation matrix and the cross-correlation vector were computed from
equations (1-14) and (1-15). The weight vector was found from (1-18), and the performance
surface given by (1-13) was then plotted as shown in figure 6. The LMS algorithm was then
executed through 500 iterations with a step size $\mu = 0.05$ and an initial condition of
W(0) = 0. The contour plots in figures 7 and 8 depict the noise free and additive noise
performance surface searches, respectively. Figures 9 and 10 show the weight vector
transients. Figures 11 and 12 present plots of the estimated mean-square error, termed
'learning curves' by Widrow. Figures 13 and 14 show the estimation error, a measure of
the convergence of the adaptive algorithm, for the no noise and noisy cases, respectively.
The reduction of the area under the learning curve and the reduction of estimation error are
the two key indicators of the dynamic response or transient behavior of the adaptive filter.

The convergence of the LMS algorithm to the optimal weight vector solution is now
considered. The derivation in this section follows that of Widrow [20]. As previously
mentioned, the LMS algorithm utilizes the square error as an estimate of the mean-square
error. From equation (1-37) we see that this leads to an unbiased estimate when the weight
vector is held constant.

$$ E[\hat{\nabla}_W(k)] = -2E[e(k)X(k)] = -2E[d(k)X(k) - X(k)X^T(k)W] = 2(R_{xx}W - R_{xd}) = \nabla_W(k) \tag{1-39} $$


Figure 6 Performance Surface

Figure 9 Noise Free Weight Vector Trajectory

Since the weight vector is not constant but changes with each iteration, we now examine the
dynamic weight convergence. It is assumed in the following development that successive
observation data vectors are independent, which allows the weight vector to be treated as
though it were independent of the observation data process. It is noted that while this may
be unnecessary, it will be used to simplify the following derivation. Then the expected value
of equation (1-38) yields

$$ E[W(k+1)] = E[W(k)] + 2\mu\left(R_{xd} - R_{xx}E[W(k)]\right) \tag{1-40} $$

Rearranging equation (1-18) or (1-35) to yield $R_{xd} = R_{xx}W_{opt}$ we find

$$ E[W(k+1)] = (I - 2\mu R_{xx})E[W(k)] + 2\mu R_{xx}W_{opt} \tag{1-41} $$

Proceeding with the derivation at hand, we define the weight error vector to be the translation
vector T

$$ T(k) = W(k) - W_{opt} \tag{1-42} $$

In order to diagonalize $(I - 2\mu R_{xx})$, the coefficient matrix of E[W(k)] in (1-41), we define
the unitary matrix Q which performs a rotation upon T (introducing the new vector V) and
the similarity operation upon the correlation matrix

$$ T = QV \tag{1-43} $$

$$ R_{xx} = Q\Lambda Q^{-1} \tag{1-44} $$

where $\Lambda$ is a square diagonal matrix whose element on the l-th row is the l-th eigenvalue of
the correlation matrix. The l-th column of Q is the corresponding eigenvector. Using the
Wiener-Hopf equation and substituting (1-42) into (1-41), the weight vector error equation
becomes

$$ E[T(k+1)] = (I - 2\mu R_{xx})E[T(k)] \tag{1-45} $$

where the simplification falls out of the algebra upon the expansion. Utilizing our
transformations defined in (1-43) and (1-44), we may express this as

$$ E[V(k+1)] = (Q^{-1}IQ - 2\mu Q^{-1}R_{xx}Q)E[V(k)] = (I - 2\mu\Lambda)E[V(k)] \tag{1-46} $$

Finally, through the iteration of (1-46), we find the solution

$$ E[V(k)] = (I - 2\mu\Lambda)^k V(0) \tag{1-47} $$

The weight convergence question is now answered by considering whether the error vector
$W(k) - W_{opt}$ converges to zero. Mathematically, this is equivalent to

$$ \lim_{k\to\infty}(I - 2\mu\Lambda)^k = 0 \tag{1-48} $$

or, since both matrices inside the parenthesis are diagonal,

$$ \lim_{k\to\infty}(1 - 2\mu\lambda_i)^k = 0 \tag{1-49} $$

where $\lambda_i$ denotes the i-th eigenvalue of the observation correlation matrix. Thus, for
convergence, the step size $\mu$ must be chosen such that

$$ 0 < \mu < \frac{1}{\lambda_{max}} \tag{1-50} $$

where $\lambda_{max}$ is the largest eigenvalue of $R_{xx}$. The translation vector V(k) thus obeys a
trajectory which is the sum of n modes, where the correlation matrix is of dimension n x n,
and the i-th mode is proportional to $(1 - 2\mu\lambda_i)^k$. The speed of convergence is governed by
$\mu$. If the step size is too large for equation (1-50) to be satisfied, then one or more modes
of the translation vector will be larger than unity in magnitude and the error will increase in
time. For a fixed step size, the speed of convergence is dominated by the slowest mode.
The eigenvalue spread of the correlation matrix (the ratio of largest to smallest eigenvalues
or the condition number of the matrix) is therefore an indicator of the convergence speed of
the LMS algorithm. The larger the eigenvalue spread of the observation correlation
matrix, the slower the convergence of the algorithm.

The above results can be used to better understand figures 7 through 14. Comparing
figures 7 and 8, it is apparent that the noisy gradient estimate does not immediately approach
the minimum mean-square error solution, but walks around the bottom of the bowl. This
causes the weight jitter seen in comparing the noisy weight vector transients of figure 9 to
those of figure 10. From equation (1-49), we see that the learning curves in figures 11 and
12 should decay according to geometric ratios of the form $(1 - 2\mu\lambda_i)$, yielding a time
constant for the i-th mode of

$$ \tau_i = \frac{1}{4\mu\lambda_i} \tag{1-51} $$

where we have used the convention of Widrow that the time constant of the mean-square
error learning curve is one-half that of the geometric decay [14]. The noisy estimate of the
mean-square error causes the jitter shown in figure 12. The estimation error in figures 13
and 14 depicts the amount of time it takes for the adaptive filter to learn the amplitude
difference and phase shift between the desired signal and the observed signal. In figure
14, the error is at first sinusoidal and then becomes increasingly random.
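
The step size bound (1-50) and the time constants (1-51) can be evaluated for a small case. The sketch below assumes a noise-free unit-amplitude sinusoid with `N = 16` samples per cycle feeding two taps, for which $r_{xx}(0) = 1/2$ and $r_{xx}(1) = \frac{1}{2}\cos(2\pi/N)$. The value of `N` and the helper name are illustrative assumptions, not values taken from the report.

```python
import math

def eigenvalues_2x2_symmetric(r0, r1):
    """Eigenvalues (largest first) of the matrix [[r0, r1], [r1, r0]]."""
    return r0 + abs(r1), r0 - abs(r1)

N = 16                                  # samples per cycle (assumed)
r0 = 0.5                                # r_xx(0) for a unit-amplitude sinusoid
r1 = 0.5 * math.cos(2 * math.pi / N)    # r_xx(1)

lam_max, lam_min = eigenvalues_2x2_symmetric(r0, r1)
spread = lam_max / lam_min              # eigenvalue spread of R_xx
mu_limit = 1.0 / lam_max                # upper bound on mu from (1-50)

mu = 0.05
# Geometric ratio of each mode, (1 - 2*mu*lambda_i), and its learning-curve
# time constant 1 / (4*mu*lambda_i) from (1-51).
modes = [1 - 2 * mu * lam for lam in (lam_max, lam_min)]
time_constants = [1 / (4 * mu * lam) for lam in (lam_max, lam_min)]

print(spread, mu_limit, modes, time_constants)
```

Under these assumptions the two time constants differ by the eigenvalue spread, so the mode tied to the smallest eigenvalue sets the overall convergence time.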

The requirements of the LMS algorithm are evident through the examination of
equations (1-39) and (1-40) and the fact that the LMS approximation is accomplished by
estimating the unknown average values with the available present values; in essence
dropping the expectation operator. Specifically, a desired signal is required and the
correlation matrices are approximated by

$$ \hat{R}_{xx}(k) = X(k)X^T(k) \tag{1-52} $$

$$ \hat{R}_{xd}(k) = X(k)d(k) \tag{1-53} $$
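
As a quick consistency check, substituting these instantaneous estimates into the true gradient $2(R_{xx}W - R_{xd})$ of (1-39) recovers the LMS gradient estimate of (1-37). The hatted symbols denote the instantaneous estimates; nothing else is assumed.

```latex
\begin{aligned}
2\left( \hat{R}_{xx}(k) W(k) - \hat{R}_{xd}(k) \right)
  &= 2\left( X(k) X^T(k) W(k) - X(k) d(k) \right) \\
  &= -2 X(k) \left( d(k) - X^T(k) W(k) \right) \\
  &= -2 e(k) X(k) .
\end{aligned}
```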

The desired signal requirement has been addressed by Widrow, who derived training
schemes to provide this signal [17], and Compton, who demonstrated that the algorithm can
be successfully applied to communications when the desired information carrying signal is
unknown but some of the characteristics of it are available [18].

Griffiths  modified  the  LMS  algorithm  through  recognizing  that  a  priori  knowledge 
of  the  desired  signal  correlation  function,  its  direction  of  arrival  and  the  array  geometry  allow 
the  received  cross  correlation  vector  to  be  defined  [33].  This  relieves  the  algorithm  of  the 
desired  signal  requirement  and  allows  the  vector  Rxd  to  be  formed  off-line.  The  only 
real-time estimate that must be formed in Griffiths' LMS scheme is the correlation matrix
$R_{xx}$.

The  principal  algorithm  of  interest  in  this  research  is  a  linearly  constrained  LMS 
algorithm.  This  algorithm,  presented  in  the  next  chapter,  will  be  seen  to  utilize  a  directional 
constraint  in  order  to  completely  relieve  any  requirement  of  a  desired  signal  or  its  statistical 
characterization.  In  fact,  the  desired  signal  d(k)  is  taken  to  be  identically  zero.  The 
correlation  matrix  Rxx,  as  with  Griffiths’  algorithm,  is  the  only  required  real-time  estimate 
and  the  necessary  a  priori  information  is  simply  the  direction  of  arrival  of  the  desired  signal. 

II. CONSTRAINED PROCESSORS WITH TAPPED-DELAY-LINE STRUCTURE

The  first  constrained  adaptive  array  to  be  derived  is  termed  the  "direct  form".  A 
second  form,  which  is  a  partitioned  constrained  processor,  will  then  be  developed  from  the 
direct  form.  This  processor  and  the  direct  form  will  be  shown  to  have  identical  performance. 
A  third  form,  referred  to  as  the  "Generalized  Sidelobe  Canceller"  or  GSC  will  be  shown  to 
be  a  partitioned  processor  which  utilizes  a  methodology  to  separate  the  constraint  from  the 
adaptive  beamformer  in  a  manner  which  results  in  an  unconstrained  adaptation. 

A.  Direct  Form  Processor 

The  derivation  in  this  section  follows  that  of  Frost  [11].  The  geometrical 
interpretation  and  portions  of  the  algorithm  development  have  been  expanded.  A  direct 
form  constrained  processor  with  K  sensors  and  J  taps  per  sensor  is  displayed  in  figure  15. 
We  assume  that  the  array  has  been  electronically  pre-steered  so  as  to  be  parallel  to  the  desired 
signal’s  wavefront  through  the  use  of  the  shown  steering  time  delays.  This  is  referred  to  as 
a  signal  aligned  array. 

The  received  signal  X(t)  is  a  composite  of  the  desired  signal  S(t)  and  noise  N(t).  The 
noise  itself  may  be  a  composite  of  passive  and  active  noises.  An  example  of  passive  noises 
would  be  the  thermal  noise  present  on  each  element,  while  an  example  of  active  noises  might 
be  hostile  jammers.  The  signals  are  sampled  and  processed  so  that  the  received  signal  at  the 
k-th  sample  may  be  written  as: 

$$ X(k) = S(k) + N(k) \tag{2-1} $$

Figure 15 Direct Form Tapped-Delay-Line Processor

The  signals  in  this  derivation  are  assumed  to  be  realizations  of  zero  mean  stochastic 
processes  with  unknown  second  order  statistics.  The  notation  that  will  be  used  to  describe 
the  signal  correlation  matrices  is: 

$$R_{XX} = E[X(k) X^T(k)]$$

$$R_{NN} = E[N(k) N^T(k)] \tag{2-2}$$

$$R_{SS} = E[S(k) S^T(k)]$$

The desired signal S(k) is assumed to be uncorrelated with the noise N(k).

The signal X(k) impinges on the sensor array and arrives at each element at a different
time determined by the array spacing and the composite signal's direction of arrival. Since
the array is assumed signal aligned, the look direction waveforms S(k) are the same down
each column of the array.

The KJ dimensional stacked vectors of look direction waveforms S(k), noises
N(k) and weights $W_{DF}(k)$ may be written as

$$S(k) = [\, \underbrace{s(k), \ldots, s(k)}_{K},\ \underbrace{s(k-\Delta), \ldots, s(k-\Delta)}_{K},\ \ldots,\ \underbrace{s(k-(J-1)\Delta), \ldots, s(k-(J-1)\Delta)}_{K} \,]^T$$

$$N(k) = [\, n_1(k), n_2(k), \ldots, n_{KJ}(k) \,]^T, \qquad W_{DF}(k) = [\, w_{DF_1}(k), w_{DF_2}(k), \ldots, w_{DF_{KJ}}(k) \,]^T \tag{2-3}$$

where $\Delta$ is the delay time introduced between successive taps in the array. The subscript
DF will be suppressed for the remainder of section A, and will only be used in subsequent
sections to clarify the array form being referenced as needed.

1) The Adaptive Algorithm: The algorithm for adapting the direct form weights
must be capable of maintaining a chosen frequency response in the look direction while
minimizing the output power in other directions, power which is due to undesirable noises.
As previously shown, the desired signal produces identical components on each column of
taps while the noise arriving from other directions will not in general produce equal
components at the tap inputs. The received composite signal at each tap is multiplied by a
corresponding weight and summed to produce the array output. It is therefore evident that
under these conditions the multichannel processor appears as a single channel
tapped-delay-line with respect to the desired signal. This equivalent single channel filter
has weights which are equal to the sum of the weights in the corresponding vertical column
of the multichannel processor. This is shown in figure 16 for a three tap three channel array.
Through constraining the sums of the weights in each of the J vertical columns to have some
value $f_j$, we have in fact fixed the frequency response of the processor in the look direction.
The cost of demanding this response is the loss of J degrees of freedom. Thus, only KJ-J
degrees of freedom in choosing the weight values may be used to minimize the total output
power from the array processor. Minimization of the total array output power subject to our
constraint is equivalent to minimizing all non-look direction noises as long as the desired
signal is uncorrelated with the noise.

Figure 16  Equivalent Look Direction Processor ($f_1 = w_1 + w_2 + w_3$, $f_2 = w_4 + w_5 + w_6$, $f_3 = w_7 + w_8 + w_9$)

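In terms of the signal and weight vectors, the look direction constraint may be
expressed as

$$C^T W = F \tag{2-4}$$

where the KJ x J matrix C and the J vector F

$$C = \begin{bmatrix} \mathbf{1} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{1} & \cdots & \mathbf{0} \\ \vdots & & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{1} \end{bmatrix}, \qquad F = [\, f_1, f_2, \ldots, f_J \,]^T \tag{2-5}$$

(with $\mathbf{1}$ and $\mathbf{0}$ denoting K-dimensional columns of ones and zeros)
effectively define the sums of the weights on each column of taps.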
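The column-sum constraint can be illustrated numerically. The following is a minimal sketch, assuming NumPy and made-up weight and signal values; the helper name `constraint_matrix` is hypothetical and not part of the original report. It builds C for the three tap, three channel array of figure 16 and checks that, for a signal-aligned look direction waveform, the multichannel output equals that of a single channel filter whose taps are the column sums.

```python
import numpy as np

def constraint_matrix(K, J):
    """Return the KJ x J matrix C: column j holds ones on the j-th group of K taps."""
    return np.kron(np.eye(J), np.ones((K, 1)))

K, J = 3, 3                              # three channels, three taps (figure 16)
C = constraint_matrix(K, J)

W = np.arange(1.0, K * J + 1)            # illustrative weights w1..w9 (assumed values)
F = C.T @ W                              # column sums f1, f2, f3

# Look direction samples s(k), s(k - delta), s(k - 2*delta): assumed values.
s = np.array([0.5, -1.0, 2.0])
S = np.repeat(s, K)                      # stacked vector: each sample repeated down a column

y_multichannel = W @ S                   # output of the full KJ-weight processor
y_single = F @ s                         # output of the equivalent J-tap filter
print(F, y_multichannel, y_single)
```

Because only the column sums matter to the look direction signal, fixing F fixes the look direction frequency response while leaving KJ-J degrees of freedom for noise rejection.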

The output of the processor at the k-th sample is

$$y(k) = W^T(k)\, X(k) \tag{2-6}$$

and the expected value of the array output power is

$$E[y^2(k)] = E[W^T X(k) X^T(k) W] = W^T R_{XX} W \tag{2-7}$$

The problem at hand is to find the optimum weight vector W that will minimize the array
output power (a scalar performance index)

$$L = \tfrac{1}{2} W^T R_{XX} W \tag{2-8}$$

subject to the constraint

$$g = C^T W - F = 0 \tag{2-9}$$

The constrained optimization problem can be reduced to an unconstrained problem through
the use of Lagrange multipliers [10]. The Lagrangian for this system is

$$H = L + \lambda^T g = \tfrac{1}{2} W^T R_{XX} W + \lambda^T (C^T W - F) \tag{2-10}$$

where $\lambda \in \mathbb{R}^J$ is an undetermined vector Lagrange multiplier. The necessary conditions for
optimality are

$$H_W = \frac{\partial H}{\partial W} = 0 \quad \text{and} \quad C^T W = F \tag{2-11}$$

Taking the gradient of the Lagrangian

$$H_W = R_{XX} W + C \lambda = 0 \tag{2-12}$$

and solving for the optimal weight vector yields

$$W_{opt} = -R_{XX}^{-1} C \lambda \tag{2-13}$$

Since $W_{opt}$ must satisfy the constraint

$$C^T W_{opt} = -C^T R_{XX}^{-1} C \lambda = F \tag{2-14}$$

we find that the optimal value of the Lagrange multiplier is

$$\lambda = -[\, C^T R_{XX}^{-1} C \,]^{-1} F \tag{2-15}$$

and the optimal weight vector may be written as

$$W_{opt} = R_{XX}^{-1} C \,[\, C^T R_{XX}^{-1} C \,]^{-1} F \tag{2-16}$$

The constrained least squares estimate of the look direction signal is given by

$$y_{opt}(k) = W_{opt}^T X(k) \tag{2-17}$$
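As an illustration of equation (2-16), the sketch below evaluates the closed-form optimum for a synthetic correlation matrix. It is a minimal example, assuming NumPy; the random matrix standing in for $R_{XX}$ and the function name `optimal_weights` are assumptions made only for this demonstration.

```python
import numpy as np

def optimal_weights(Rxx, C, F):
    """Evaluate W_opt = Rxx^{-1} C (C^T Rxx^{-1} C)^{-1} F without forming Rxx^{-1}."""
    RinvC = np.linalg.solve(Rxx, C)                  # Rxx^{-1} C
    return RinvC @ np.linalg.solve(C.T @ RinvC, F)

rng = np.random.default_rng(0)
K, J = 4, 3
n = K * J

# Synthetic symmetric positive definite matrix used as a stand-in for Rxx.
A = rng.standard_normal((n, 2 * n))
Rxx = A @ A.T / (2 * n)

C = np.kron(np.eye(J), np.ones((K, 1)))
F = np.zeros(J)
F[0] = 1.0                                           # one unit reference element

W_opt = optimal_weights(Rxx, C, F)
constraint_error = np.max(np.abs(C.T @ W_opt - F))

# Any other vector on the constraint plane should give no less output power.
Q = C @ np.linalg.solve(C.T @ C, F)
power_opt = W_opt @ Rxx @ W_opt
power_q = Q @ Rxx @ Q
print(constraint_error, power_opt, power_q)
```

Using `np.linalg.solve` rather than an explicit inverse is a numerical convenience of this sketch, not something prescribed by the derivation.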

If we now assume for the time being that $R_{XX}$ is known, we may derive the
deterministic constrained least mean square (CLMS) algorithm. The weight vector
is initialized on the constraint plane and subsequently moved in the direction of the
negative gradient at each iteration. The adaptive step size used to walk down the
performance surface is proportional to the magnitude of the gradient. The weight state and
costate equations are given by

$$W(k+1) = W(k) - \mu H_W = W(k) - \mu\,[\, R_{XX} W(k) + C \lambda(k) \,] \tag{2-18}$$

$$\lambda(k) = -\frac{1}{\mu} (C^T C)^{-1} (F - C^T W(k)) - (C^T C)^{-1} C^T R_{XX} W(k) \tag{2-19}$$

The term $F - C^T W(k)$ in the costate equation permits the algorithm to make corrections
for small deviations from the constraint, preventing large errors due to the accumulation of
slight trajectory flaws. Substituting the weight costate equation into the weight state
equation yields the CLMS algorithm

$$W(k+1) = P\,[\, W(k) - \mu R_{XX} W(k) \,] + Q \tag{2-20}$$

where

$$P = I - C (C^T C)^{-1} C^T \tag{2-21}$$

and

$$Q = C (C^T C)^{-1} F \tag{2-22}$$

This deterministic solution requires a priori knowledge of the signal correlation matrix
$R_{XX}$. The stochastic CLMS algorithm is obtained by estimating the correlation matrix at
each iteration. A readily available estimate is formed at the k-th iteration by

$$\hat{R}_{XX} = X(k) X^T(k) \tag{2-23}$$

where we note that other estimates of the correlation matrix could be used in place of
equation (2-23). This form of the estimate, which is made up of the outer product of the
available tap voltage vector, is the simplest, and is consistent with the LMS algorithm.
Thus, from equations (2-20) and (2-23), we find that the weight update equation may now
be written

$$W(k+1) = P\,[\, W(k) - \mu\, y(k) X(k) \,] + Q \tag{2-24}$$
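The update in equation (2-24) can be sketched in a few lines. The example below is illustrative only, assuming NumPy and a made-up environment (white noise plus one strong interferer whose signature vector `d` is invented for the demonstration); the function name `clms` and all numerical values are assumptions, not the simulation used in this report.

```python
import numpy as np

def clms(X, C, F, mu):
    """Stochastic CLMS: W(k+1) = P [W(k) - mu*y(k)*X(k)] + Q, one step per row of X."""
    n = C.shape[0]
    CtC_inv = np.linalg.inv(C.T @ C)
    P = np.eye(n) - C @ CtC_inv @ C.T      # projection, equation (2-21)
    Q = C @ CtC_inv @ F                    # translation vector, equation (2-22)
    W = Q.copy()                           # start on the constraint plane
    for x in X:
        y = W @ x                          # array output y(k)
        W = P @ (W - mu * y * x) + Q       # equation (2-24)
    return W

rng = np.random.default_rng(1)
K, J, N = 4, 2, 3000
C = np.kron(np.eye(J), np.ones((K, 1)))
F = np.array([1.0, 0.0])

# Assumed environment: white noise (std 0.3) plus an interferer (std 2.0).
d = np.array([1.0, 0.2, -0.5, 0.9, 0.3, -0.7, 0.6, 0.1])
X = 0.3 * rng.standard_normal((N, K * J)) + np.outer(2.0 * rng.standard_normal(N), d)

W = clms(X, C, F, mu=0.005)

# True correlation matrix of the assumed environment, used only for evaluation.
Rxx = 0.09 * np.eye(K * J) + 4.0 * np.outer(d, d)
Q = C @ np.linalg.inv(C.T @ C) @ F
power_start = Q @ Rxx @ Q                  # output power of the initial weights
power_end = W @ Rxx @ W                    # output power after adaptation
constraint_error = np.max(np.abs(C.T @ W - F))
print(power_start, power_end, constraint_error)
```

Because every iterate is re-projected and translated by Q, the constraint is met at each step regardless of the data, which is the error-correcting property noted after equation (2-19).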

For later comparison, a signal flow block diagram is presented in figure 17. Note
that the pre-steering filters for the signal aligned array are not shown in block diagrams. For
the remainder of this research we will be solely concerned with linearly constrained
Minimum Variance Distortionless Response (MVDR) adaptive array sensors. In terms of
the direct form processor, this may be realized by enforcing the condition that the vector
F be composed of one reference element equal to unity and the remaining elements equal
to zero. All of the results presented, however, will be general in nature and not reliant upon
this emphasis unless explicitly stated.

Figure 17  Direct Form Block Diagram

2) Geometrical Interpretation: The geometrical analysis of the CLMS algorithm
requires a number of definitions and propositions. These are presented in appendix I. The
CLMS algorithm will now be shown to have a very simple representation. In addition, the
relationship between the CLMS algorithm and the standard LMS algorithm will become
evident.

Consider the diagram shown in figure 18. The subspace Z is that subspace which satisfies the
homogeneous form of the constraint equation. Thus, Z is the nullspace of $C^T$, where C is the
matrix defined in equation (2-5). This subspace will be termed the homogeneous constraint




Substituting the weight costate equation into the equation for $W_{opt}$ yields the vector which
we seek:

$$W_{opt} = C (C^T C)^{-1} F \tag{2-29}$$

This vector is identical to the vector Q in the CLMS algorithm, equation (2-22).

It can easily be verified that Q is orthogonal to any vector z in Z by examining the
inner product

$$Q^T z = F^T (C^T C)^{-1} C^T z \tag{2-30}$$

and noting that $C^T z = 0$ by definition.

Consider the projection operation (defined in definition A1-6 and characterized by
propositions A1-2 and A1-3 of appendix I). A vector W may be decomposed as the sum
of one vector in Z and one vector from the orthogonal space spanned by the J linearly
independent columns of the constraint matrix C, termed the constraint subspace $\Psi$. The
projection operator of interest acts as an identity operator on components within Z and as
an annihilator operator on components in $\Psi$.

The matrix P in equation (2-21) can be seen to be a projection operator onto Z by
noting that

$$C^T (P W) = C^T (I - C (C^T C)^{-1} C^T)\, W = 0 \tag{2-31}$$

and therefore $PW \in Z$. We also note that

$$(I - P) W = W - P W = C (C^T C)^{-1} C^T W \in \Psi \tag{2-32}$$

We now rewrite the CLMS algorithm (2-24) for convenience:

$$W(k+1) = P\,[\, W(k) - \mu\, y(k) X(k) \,] + Q \tag{2-33}$$

The term inside the bracket is the standard LMS algorithm [23] with the desired signal
d(k) = 0, and the expression y(k)X(k) is an estimate of the unconstrained gradient
at time k. The CLMS algorithm thus computes the LMS estimate and projects the resulting
vector onto the subspace Z. The next weight vector W(k+1) is then formed by translating
the projected vector onto the constraint surface $\Omega$ through the vector addition with Q. This
is depicted in figure 18.
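The projection properties used in this interpretation are easy to confirm numerically. The following minimal sketch, assuming NumPy and an arbitrary random test vector, checks that P is idempotent, that PW lies in Z, that Q is orthogonal to Z, and that the translated vector lands on the constraint surface.

```python
import numpy as np

K, J = 4, 3
n = K * J
C = np.kron(np.eye(J), np.ones((K, 1)))
F = np.array([1.0, 0.0, 0.0])

CtC_inv = np.linalg.inv(C.T @ C)
P = np.eye(n) - C @ CtC_inv @ C.T          # projection onto Z, equation (2-21)
Q = C @ CtC_inv @ F                        # equation (2-22)

rng = np.random.default_rng(0)
W = rng.standard_normal(n)                 # arbitrary test vector (assumed)

z = P @ W                                  # component of W in Z
psi = (np.eye(n) - P) @ W                  # component in the span of the columns of C
W_next = z + Q                             # projected vector translated by Q

checks = {
    "P is idempotent": np.allclose(P @ P, P),
    "P W lies in Z": np.allclose(C.T @ z, 0.0),
    "Q is orthogonal to Z": abs(Q @ z) < 1e-12,
    "decomposition is exact": np.allclose(z + psi, W),
    "translated vector meets the constraint": np.allclose(C.T @ W_next, F),
}
print(checks)
```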


3) Convergence of the CLMS algorithm: The weight vector W(k) in the CLMS
algorithm is a function of W(0) and the sequence {X(k)}. Throughout this development we
have assumed that the observation vectors X(k) are independent. We note that Frost [11]
has pointed out that this may be unnecessary and Daniell [21] has shown $\epsilon$-convergence
based on only asymptotic independence. Utilizing this assumption, X(k) is independent of
W(k) and the expected value of the CLMS algorithm may be written as

$$E[W(k+1)] = P\,(\, E[W(k)] - \mu R_{XX} E[W(k)] \,) + Q \tag{2-34}$$

From proposition A1-2 of appendix I, we may express Q as

$$Q = (I - P)\, W_{opt} \tag{2-35}$$

Let $\varphi(k+1)$ denote the difference between the mean adaptive weight vector at time k+1 and
the optimal weight vector:

$$\varphi(k+1) = E[W(k+1)] - W_{opt} \tag{2-36}$$

Proposition II-1: Let $W_{exp}$ and $W_{opt}$ be elements of $\Omega$ with the difference vector
$\varphi = W_{exp} - W_{opt}$. Then $\varphi \in Z$ and $P\varphi = \varphi$.

Proof of Proposition II-1: Since $W_{exp}, W_{opt} \in \Omega$, then

$$C^T \varphi = C^T W_{exp} - C^T W_{opt} = F - F = 0 \tag{2-37}$$

and $\varphi \in Z$. By proposition A1-2 and definition A1-6 of appendix I, if $\varphi \in Z$ then $P\varphi = \varphi$.
This may be shown algebraically:

$$P \varphi = [\, I - C (C^T C)^{-1} C^T \,]\, \varphi = \varphi - 0 = \varphi$$

With the results from proposition II-1, we may rewrite the iterative difference process
as

$$\varphi(k+1) = P\,(\, E[W(k)] - \mu R_{XX} E[W(k)] \,) + [I - P]\, W_{opt} - W_{opt} \tag{2-38}$$

$$= P \varphi(k) - \mu P R_{XX} \varphi(k) = [\, I - \mu P R_{XX} P \,]\, \varphi(k) = [\, I - \mu P R_{XX} P \,]^{k+1} \varphi(0) \tag{2-39}$$

where we have used the fact that, from equations (2-16) and (2-21), $P R_{XX} W_{opt} = 0$.

The matrix $P R_{XX} P$ is the correlation matrix of the projected observations. It is the
non-zero eigenvalues of this matrix which determine both the convergence rate of the CLMS
algorithm and its steady-state performance with respect to the optimum. This
projected correlation matrix $P R_{XX} P \in \mathbb{R}^{n \times n}$ and is symmetric. Hence, it has
n orthogonal eigenvectors.

Proposition II-2: m of the n eigenvectors of the matrix $P R_{XX} P$ are outside of the
subspace Z and have zero eigenvalues. The remaining n-m eigenvectors have non-zero
eigenvalues and lie within Z.

Proof of Proposition II-2: The matrix C has full column rank.
Therefore, C has m columns of linearly independent vectors. It is evident that the product
$C^T P R_{XX} P = 0$ since $C^T P = 0$. Thus, by the symmetry of $P R_{XX} P$, the m columns of C are
eigenvectors of $P R_{XX} P$ with zero eigenvalues.

The columns of C are orthogonal to Z. Therefore, the remaining (n-m) eigenvectors
must be in Z. From proposition II-1, if $\varphi \in Z$ then $P\varphi = \varphi$. Thus, if $v_i$ is an eigenvector of
$P R_{XX} P$ in Z, then

$$v_i^T P R_{XX} P\, v_i = v_i^T R_{XX}\, v_i > 0 \tag{2-40}$$

Furthermore, if $\sigma_i$ is an eigenvalue corresponding to $v_i \in Z$, then $P R_{XX} P v_i = \sigma_i v_i$ and

$$v_i^T P R_{XX} P\, v_i = \sigma_i v_i^T v_i = \sigma_i \tag{2-41}$$

We now consider the relationship between the eigenvalues of the constrained system
and those of the unconstrained correlation matrix. We continue the notation used to denote
the (n-m) eigenvalues of $P R_{XX} P$ as $\sigma_i$ and will denote the n non-zero eigenvalues of $R_{XX}$ as
$\lambda_i$. The well known result, generally referred to as Rayleigh's Theorem [22], states:

$$\lambda_{min} \le \sigma_{min} \le \sigma_i \le \sigma_{max} \le \lambda_{max} \tag{2-42}$$

The difference vector $\varphi(0)$, defined in the previous section, lies entirely within Z and
may therefore be considered as a linear combination of the eigenvectors $v_i$ which correspond
to the (n-m) non-trivial eigenvalues of $P R_{XX} P$. Thus,

$$\varphi(k+1) = [\, I - \mu P R_{XX} P \,]^{k+1} v_i = [\, 1 - \mu \sigma_i \,]^{k+1} v_i \tag{2-43}$$

The convergence along any eigenvector $v_i$ is then geometric with ratio $[1 - \mu \sigma_i]$ and
associated time constant

$$\tau_i = \frac{-1}{\ln(1 - \mu \sigma_i)} \tag{2-44}$$

If $\mu \sigma_i \ll 1$, then $\frac{-1}{\ln(1 - \mu \sigma_i)} \to \frac{1}{\mu \sigma_i}$, and it becomes evident that if $\mu$ is chosen so that

$$0 < \mu < \frac{1}{\sigma_{max}} \tag{2-45}$$

then the norm of the difference vector is bounded between two monotonically decreasing
geometric progressions

$$[\, 1 - \mu \sigma_{max} \,]^{k+1} \|\varphi(0)\| \le \|\varphi(k+1)\| \le [\, 1 - \mu \sigma_{min} \,]^{k+1} \|\varphi(0)\| \tag{2-46}$$

Therefore, if the initial difference is finite, then E[W(k)] converges to the optimum with the
time constants for the geometric ratio given in equation (2-44) above. That is to say, the
weight vector converges in the sense that

$$\lim_{k \to \infty} \| E[W(k)] - W_{opt} \| = 0 \tag{2-47}$$
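The eigenvalue statements above can be checked on a small example. The sketch below is a hedged illustration, assuming NumPy and a made-up correlation matrix (white noise plus one rank-one interferer); it counts the zero eigenvalues of $P R_{XX} P$, evaluates the time constants of equation (2-44), and confirms the bounds of equation (2-46) for the mean recursion of equation (2-39).

```python
import numpy as np

K, J = 4, 2
n = K * J
C = np.kron(np.eye(J), np.ones((K, 1)))
P = np.eye(n) - C @ np.linalg.inv(C.T @ C) @ C.T

# Assumed correlation matrix: white noise of power 0.09 plus one strong interferer.
d = np.array([1.0, 0.2, -0.5, 0.9, 0.3, -0.7, 0.6, 0.1])
Rxx = 0.09 * np.eye(n) + 4.0 * np.outer(d, d)

A = P @ Rxx @ P
sigma = np.linalg.eigvalsh(A)
nonzero = sigma[sigma > 1e-9]              # the n - J non-trivial eigenvalues

mu = 0.5 / nonzero.max()                   # satisfies 0 < mu < 1 / sigma_max
tau = -1.0 / np.log(1.0 - mu * nonzero)    # time constants, equation (2-44)

# Mean weight error recursion phi(k+1) = (I - mu P Rxx P) phi(k), with phi(0) in Z.
rng = np.random.default_rng(0)
phi0 = P @ rng.standard_normal(n)
phi = phi0.copy()
steps = 40
for _ in range(steps):
    phi = phi - mu * (A @ phi)

norm0 = np.linalg.norm(phi0)
lower = (1.0 - mu * nonzero.max()) ** steps * norm0
upper = (1.0 - mu * nonzero.min()) ** steps * norm0
print(len(nonzero), tau.min(), tau.max(), lower, np.linalg.norm(phi), upper)
```

In this assumed environment the interferer mode has a short time constant while the noise-only modes decay slowly, which is the eigenvalue spread effect that motivates the later sections.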


The operation of the CLMS algorithm in a quasi-stationary environment is now
considered. The algorithm step size $\mu$ is assumed to remain constant during this
development. The weight vector adaptation results in a non-zero variance about its optimal
value. This adds an additional cost, termed "misadjustment" by Widrow [14]. This
dimensionless quantity, denoted $M(\mu)$, is a measure of how closely the algorithm approaches
the optimal performance.

$$M(\mu) = \lim_{k \to \infty} \frac{E[y^2(k)] - E[y_{opt}^2(k)]}{E[y_{opt}^2(k)]} \tag{2-48}$$

Frost [11] has shown that the steady-state misadjustment may be bounded by

$$\frac{\frac{\mu}{2}\, \mathrm{trace}(P R_{XX} P)}{1 - \frac{\mu}{2} [\, \mathrm{trace}(P R_{XX} P) + 2 \sigma_{min} \,]} \le M(\mu) \le \frac{\frac{\mu}{2}\, \mathrm{trace}(P R_{XX} P)}{1 - \frac{\mu}{2} [\, \mathrm{trace}(P R_{XX} P) + 2 \sigma_{max} \,]} \tag{2-49}$$

Thus, $M(\mu)$ can be made arbitrarily close to zero by choosing a small step size. The
fundamental trade-off is that improved steady-state performance comes at the expense of
increased convergence time.


B. Partitioned Form Constrained Processor

The algorithm for the direct form TDL structure clearly shows two discernible
events taking place. The first event is the adaptive process which determines the LMS
solution. The second event is the enforcement of the constraint each
iteration, which is both deterministic and decided upon before implementation. This
algorithm partitioning was also depicted in the preceding geometrical interpretation and
figure 18.

Figure 19  Partitioned CLMS Processor

Noting this separation in the algorithm, it would seem reasonable that one could
change the form of the processor in order to partition the two aforementioned events. This
is the approach that was taken by Griffiths [15] and is now explained. The adaptive section
of the processor would operate in the subspace Z and the non-adaptive section would perform
the translation to the constraint surface $\Omega$, satisfying the constraint. Thus, this structure separates
the conventional beamformer from the adaptive processor. As shown in figure 19, the
partitioned processor described above is implemented so that

$$C^T W_c = F \tag{2-50}$$

and

$$C^T W_{PF} = 0 \tag{2-51}$$

where $W_{PF}$ represents the adaptive weights in a manner analogous to section II.A.1 and
$W_c$ is a conventional non-adaptive beamforming filter. Equation (2-51) ensures that the
desired signal is eliminated from the adaptive processor input. Equations (2-50) and (2-51)
can be seen to satisfy equation (2-4) by considering the global partitioned weight vector

$$W = W_c - W_{PF} \tag{2-52}$$

and noting that

$$C^T W = C^T W_c - C^T W_{PF} = F - 0 = F \tag{2-53}$$

The partitioned processor shown in figure 19 thus utilizes the CLMS algorithm:

$$\min_{W_{PF}} \; (W_c - W_{PF})^T R_{XX} (W_c - W_{PF}) \tag{2-54}$$

subject to the constraint

$$C^T W_{PF} = 0 \tag{2-55}$$

The optimal adaptive weight vector $W_{PF_{opt}}$ for the partitioned processor is now derived.
The output of the processor is

$$y(k) = (W_c - W_{PF})^T X(k) \tag{2-56}$$

and the Lagrangian for this system is

$$H = \tfrac{1}{2} (W_c - W_{PF})^T R_{XX} (W_c - W_{PF}) + \lambda^T (C^T W_{PF}) \tag{2-57}$$

The necessary conditions for optimality are

$$H_{W_{PF}} = \frac{\partial H}{\partial W_{PF}} = 0 \quad \text{and} \quad C^T W_{PF} = 0 \tag{2-58}$$

Taking the gradient of the Lagrangian

$$H_{W_{PF}} = -R_{XX} (W_c - W_{PF}) + C \lambda = 0 \tag{2-59}$$

and solving for the optimal weight vector yields

$$W_{PF_{opt}} = W_c - R_{XX}^{-1} C \lambda \tag{2-60}$$

Since $W_{PF_{opt}}$ must satisfy the constraint

$$C^T (W_c - R_{XX}^{-1} C \lambda) = 0 \tag{2-61}$$

we find that the optimal value of the Lagrange multiplier is

$$\lambda = [\, C^T R_{XX}^{-1} C \,]^{-1} C^T W_c \tag{2-62}$$

and the optimal weight vector may be written as

$$W_{PF_{opt}} = W_c - R_{XX}^{-1} C \,[\, C^T R_{XX}^{-1} C \,]^{-1} C^T W_c \tag{2-63}$$

which can be seen to be equivalent to equation (2-16) by considering $W = W_c - W_{PF_{opt}}$. It is
apparent that the array transient response is unchanged from that of the direct form due to
the partitioned form's dependence upon the eigenvalues of the same matrix $P R_{XX} P$. Since
the partitioned processor has the same transient behavior and the optimal weight vector for
the processor is equivalent to the direct form, the two processors have identical performance.
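The equivalence of the two optimal solutions can be confirmed numerically. The following minimal sketch assumes NumPy, a synthetic positive definite matrix in place of $R_{XX}$, and an arbitrary non-adaptive vector $W_c$ that satisfies $C^T W_c = F$; it compares $W_c - W_{PF_{opt}}$ with the direct form optimum of equation (2-16).

```python
import numpy as np

rng = np.random.default_rng(0)
K, J = 4, 3
n = K * J
C = np.kron(np.eye(J), np.ones((K, 1)))
F = np.array([1.0, 0.0, 0.0])

# Synthetic symmetric positive definite stand-in for Rxx (assumed).
A = rng.standard_normal((n, 3 * n))
Rxx = A @ A.T / (3 * n)

RinvC = np.linalg.solve(Rxx, C)                 # Rxx^{-1} C
M = np.linalg.inv(C.T @ RinvC)                  # (C^T Rxx^{-1} C)^{-1}
W_df_opt = RinvC @ M @ F                        # direct form optimum, equation (2-16)

# Any W_c with C^T W_c = F will do: Q plus an arbitrary component in Z.
CtC_inv = np.linalg.inv(C.T @ C)
P = np.eye(n) - C @ CtC_inv @ C.T
W_c = C @ CtC_inv @ F + P @ rng.standard_normal(n)

W_pf_opt = W_c - RinvC @ M @ (C.T @ W_c)        # partitioned form optimum
W_global = W_c - W_pf_opt                       # global partitioned weight vector

print(np.max(np.abs(W_global - W_df_opt)))
```

The arbitrary component added to $W_c$ cancels in the global weight vector, so the steady-state solution does not depend on which admissible non-adaptive beamformer is chosen.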


C. Generalized Sidelobe Canceller (GSC)


The GSC form was first introduced by Applebaum and Chapman [12] for
narrowband signals in a radar context. The derivation in this section will follow the
extensions made by Griffiths [13] for wideband signals. The iterative equation (2-24) may
be partitioned into a scalar nonadaptive equation and a matrix adaptive equation. To derive
this particular form, we will begin with the direct form processor CLMS algorithm, and
partition it in a manner which will utilize a matrix filter to ensure that the adaptive processor
operates on the homogeneous constraint subspace Z. This means that the processor does
not require the constraint $C^T W = 0$ for the adaptive section. In other words, the constrained
processor operates with an unconstrained adaptive algorithm.

We  note  that  equation  (2-24)  can  be  written  for  the  l-th  column  as 


We proceed by defining a K-dimensional constraint vector $W_c$ for the MVDR processor
such that

$$W_c = \frac{1}{K} \mathbf{1} \tag{2-65}$$

and a (K-1) x K dimensional signal blocking matrix $W_s$ composed of linearly independent
rows r(i) such that

$$r^T(i)\, \mathbf{1} = 0, \qquad i = 1, \ldots, K-1 \tag{2-66}$$

Then we introduce the invertible transformation matrix T such that


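where the (KJ-J) x KJ matrix $\tilde{W}_s$ and the KJ x 1 vector $\tilde{W}_c$ are given by

$$\tilde{W}_s = \begin{bmatrix} W_s & 0 & 0 & \cdots & 0 \\ 0 & W_s & 0 & \cdots & 0 \\ 0 & 0 & W_s & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & W_s \end{bmatrix}, \qquad \tilde{W}_c = \begin{bmatrix} W_c \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$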

and  finally,  from  (2-65)  and  (2-66)  we  note 


1 

0 


OJ 


(2-67) 


(2-68) 


(2-69)

Multiplying the stochastic CLMS algorithm by the transformation matrix T, using
equation (2-69), and defining

$$a(k) = \tilde{W}_c^T W_{DF}(k) \tag{2-70}$$

$$B(k) = -\tilde{W}_s W_{DF}(k) \tag{2-71}$$

we  find  that  equation  (2-64)  can  be  expressed  as 


a(k+l)]  ra(il:)]  (wjx 

-  =  -  +)iy(A:) _ _ _ 

B(t:+1)  B(k)  -W,X 


]  Wdf <.k)i-fi 


Substituting equations (2-65) and (2-4) into equation (2-72) yields the nonadaptive
scalar equation

$$a(k+1) = a(k) \tag{2-73}$$

and the adaptive matrix equation

$$B(k+1) = B(k) + \mu\, y(k) X_s(k) \tag{2-74}$$

Equation (2-74) is the standard unconstrained LMS algorithm and $X_s(k)$ is defined by

$$X_s(k) = \tilde{W}_s X(k) \quad \text{or} \quad X_{s_l}(k) = W_s X_l(k - l\Delta) \tag{2-75}$$

where $X_l(k - l\Delta)$ was defined in equation (1-26).

As long as the nonadaptive scalar weights a(k) satisfy

$$a_l(k) = f_l \tag{2-76}$$

then equations (2-73) and (2-74) define the stochastic CLMS algorithm with the constraint
explicitly separated from the adaptive beamformer. This is the Generalized Sidelobe
Canceller (GSC) form of the CLMS array, and is shown in figure 20, where for the directional
constraint of interest X = K-1.

The LMS algorithm in equation (2-74) is a direct function of the CLMS algorithm
weight vector in equations (2-24) and (2-64). In other words, equation (2-74) presents a
method of representing the CLMS algorithm with respect to the GSC form. In general, the
GSC algorithm weights are not a function of the CLMS algorithm, but are updated by
equation (2-74) such that the $W_i$ in figure 20 are not constrained

$$W(k) \ne W_{DF}(k) \tag{2-77}$$

and the nonadaptive weights shown satisfy equation (2-76).

Figure 20  GSC Form Tapped-Delay-Line Processor

The question of interest is the behavior of the GSC form as a function of the signal
blocking matrix $W_s$ and the unconstrained weight vector $W_{GSC}(k) = W(k)$. It is shown in
appendix II that the optimal weight vector $W_{GSC_{opt}}$ is equivalent to the other two forms
considered when equations (2-65) and (2-66) are satisfied, i.e.:

This leads to the conclusion that as long as the signal blocking matrix forces the adaptive
processor to function in the subspace Z (blocks the look-direction signal) and the
nonadaptive weights enforce the constraint, then the steady-state solution is equivalent
across all three processor forms. It should be noted that this equivalent steady-state response
is achieved through not only an unconstrained adaptation, but also with a reduced order
adaptive processor. For the directional constraint of interest, the processor need only operate
on the transformed observation vector $X_s$ of dimension (KJ-J) x 1. The transient behavior
of the GSC form, however, may or may not be identical.

Thus, the GSC can provide filtering operations which are either identical to or different
from those of the CLMS array depending on the structure of the signal blocking matrix $W_s$ and
the constraint vector $W_c$. If the linearly independent rows of $W_s$ satisfy equation (2-66),
are orthogonal

$$r^T(i)\, r(j) = 0, \qquad i \ne j \tag{2-79}$$

and $W_c$ satisfies equation (2-65), then the stochastic CLMS processor is obtained, as detailed
in appendix II and explained from a geometrical point of view below. Furthermore, if $W_s$
satisfies equation (2-66) but not equation (2-79) and/or $W_c$ does not satisfy equation (2-65),
then a processor is formed which will have the same steady-state performance, but different
transient trajectories.

The fact that the GSC form reduces to the CLMS form if equation (2-79) is enforced
is evident by considering the geometrical interpretation of section II.A.2. With a fixed $W_c$,
it is clear that the orthogonality condition for the rows of $W_s$ is equivalent to decomposing
the GSC such that

$$\tilde{W}_c = Q \tag{2-80}$$

$$\tilde{W}_s^T W_{GSC}(k+1) = P\,[\, \tilde{W}_s^T W_{GSC}(k) + \mu\, y(k) X(k) \,] \tag{2-81}$$

where P and Q are defined in equations (2-21) and (2-22), and the geometrical relationship
was shown in figure 18. The vector $\tilde{W}_c = Q$ cannot be changed if the desired
MVDR processor is to be kept. Since Q spans $\Psi$, in our search for quicker transient behavior we are
restricted to examine relationships which are non-orthogonal combinations of Q with
$\tilde{W}_s \in Z$. The weight vector at each iteration must still be an element of $\Omega$.

As an example, assume a linear array with K=J=4 and $W_c$ satisfying equation (2-65).
Then if we choose a $W_{s1}$ such that it is composed of mutually orthogonal row vectors which
are binary Walsh functions [24]

$$W_{s1} = \begin{bmatrix} 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix} \tag{2-82}$$

then we obtain a processor identical to the CLMS direct form. By contrast, if we choose a
non-orthogonal $W_{s2}$ such that it forms the difference of the adjacent channels

$$W_{s2} = \begin{bmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & -1 \end{bmatrix} \tag{2-83}$$

then we obtain the processor described by Applebaum and Chapman [12].
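The properties that distinguish the two blocking matrices can be checked directly. The sketch below, assuming NumPy (the helper names are hypothetical), verifies that both matrices block the look direction in the sense of equation (2-66), that only the Walsh rows are mutually orthogonal as required by equation (2-79), and that the scaled Walsh matrix reproduces the per-column projection used by the CLMS algorithm.

```python
import numpy as np

ones = np.ones(4)

# Walsh-function blocking matrix, equation (2-82).
Ws1 = np.array([[1, -1,  1, -1],
                [1,  1, -1, -1],
                [1, -1, -1,  1]], dtype=float)

# Adjacent-channel difference blocking matrix, equation (2-83).
Ws2 = np.array([[1, -1,  0,  0],
                [0,  1, -1,  0],
                [0,  0,  1, -1]], dtype=float)

def blocks_look_direction(Ws):
    """True if every row sums to zero, i.e. r^T(i) 1 = 0."""
    return bool(np.allclose(Ws @ ones, 0.0))

def rows_orthogonal(Ws):
    """True if the Gram matrix of the rows is diagonal."""
    G = Ws @ Ws.T
    return bool(np.allclose(G, np.diag(np.diag(G))))

# Per-column projection onto the zero-sum subspace: I - (1/K) 1 1^T with K = 4.
P_col = np.eye(4) - np.outer(ones, ones) / 4.0

print(blocks_look_direction(Ws1), rows_orthogonal(Ws1))
print(blocks_look_direction(Ws2), rows_orthogonal(Ws2))
print(np.allclose(Ws1.T @ Ws1 / 4.0, P_col))
```

The last check suggests why the orthogonal choice reproduces the CLMS trajectories: up to a scale factor that can be absorbed into the step size, $W_{s1}^T W_{s1}$ acts as the projection itself, whereas $W_{s2}^T W_{s2}$ does not.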

The capability of finding a $W_s$ which implements a rectangular quadratic
transformation $R_{X_s} = \tilde{W}_s R_{XX} \tilde{W}_s^T$ such that the resulting matrix has a smaller eigenvalue
spread than the original correlation matrix $P R_{XX} P$ is intriguing. This capability means that
the GSC adaptive processor may converge faster than the CLMS form. The quickest
convergence possible would be realized if the rectangular quadratic transformation
equalized the eigenvalues of the input correlation matrix. However, due to the form of
$\tilde{W}_s$ given in equation (2-68), it is not in general possible to determine a $W_s$ which satisfies
the requirement of equation (2-66) and equalizes the eigenvalues of the resulting matrix
$R_{X_s}$. The best that can be achieved is to find a signal blocking matrix $W_s$ which causes the
smallest eigenvalue spread of $R_{X_s}$. The dilemma is that even this requires unavailable a
priori information about the original correlation matrix itself. Thus, while one may
guarantee identical transient behavior for the GSC and the direct form processor, there is no
deterministic method available to ensure the capability of a better dynamic response.

The utility of the GSC form, however, lies in the fact that the matrix filter partitioning
permits the use of other adaptive structures to replace the tapped-delay-line unconstrained
processor and reduces both the dimensionality of the adaptive weight vector and the
algorithmic complexity. It is precisely these capabilities which the remainder of this
research will concentrate on, with the goal of reducing the adaptive processor convergence
time.

D. Normalization of the Step Size Gain

The CLMS and the LMS algorithms used in the direct form and GSC form arrays,
respectively, utilize a constant step size gain denoted $\mu$. This gain is dependent upon the
input signal power as described in section II.A.3, where the CLMS misadjustment was
derived. This dependence is undesirable because $\mu$ remains constant while the signal power
changes over time. The dependency may also be seen by considering an
increase $\Delta s$ in the input signal. For the LMS algorithm, this is equivalent to increasing the
gain from $\mu$ to $\mu (\Delta s)^2$, which is the same increase observed in the eigenvalues of the signal
correlation matrix. If the signal power grows too large the adaptation algorithm may become
unstable.

One method of solving the problem described above is to implement a time-varying
step size gain. The use of time-varying step sizes for the LMS algorithm has been considered
by many authors for the narrowband single channel TDL adaptive filter [50,51,52,53]. The
development in this section follows the development in Honig and Messerschmitt [53] with
the utilization of different parameters and extensions to the multichannel wideband case of
interest. The step size should be normalized to the input signal power at each iteration.
Thus, it is desirable to change the CLMS and LMS algorithms in equations (2-24) and (2-74)
to the form

$$W_{CLMS}(k+1) = P\,[\, W_{CLMS}(k) - \mu(k)\, y(k) X(k) \,] + Q \tag{2-84}$$

$$W_{LMS}(k+1) = W_{LMS}(k) + \mu(k)\, y(k) X_s(k) \tag{2-85}$$

where $\mu(k)$ is the aforementioned time-varying step size. The magnitude of the step size
each iteration will be approximately the same on average if at each sample it is normalized
by an estimate of the input signal power. For the l-th TDL, this may be expressed as

$$\mu_l(k) = \frac{\alpha}{\sigma_l^2(k)} \tag{2-86}$$

where $\sigma_l^2(k)$ is a measure of the power content of the received signal at the l-th channel
during the k-th iteration and $\alpha$ is a scalar.

One choice for forming the power estimate is through the use of a single pole
low-pass filter. This is equivalent to using an exponentially weighted time average of the
input signal power at each tap of the channel, and may be written as

$$\sigma_l^2(k) = \beta\, \sigma_l^2(k-1) + X_l^T(k) X_l(k) \tag{2-87}$$

for the l-th channel. The selection of $\beta$ such that $0 < \beta < 1$ controls the bandwidth of the filter
and the resulting power averaging time. Let $\alpha = 1$; then the selection of $\beta$ such that $\beta \approx 1$
yields

$$\lim_{k \to \infty} E\left[ \frac{1}{\mu(k)} \right] = \frac{E[X^T(k) X(k)]}{1 - \beta} \tag{2-88}$$


We  now  consider  setting  a  =  1  -  p.  This  is  equivalent  to  letting 

£IOxV-)] 

which  is  reasonable  as  long  as  (1  -p)  is  small  enough  to  smooth  out  the  statistical 
fluctuations  in  the  step  size  so  that  becomes  virtually  independent  of  the  observation 

data $X(k)$. If the step size at $k = 0$ is initialized from the initial data power $X^T(0)X(0)$,
then the expected value of the step size will approximately hold its steady-state value for
all $k$. The time constant in equation (2-44) will become

$$\tau_i \approx \frac{\sigma^2(k)}{(1-\beta)\,\lambda_i} \approx \frac{E[X^T(k)X(k)]}{(1-\beta)\,\lambda_i} \tag{2-90}$$

which shows the proportionality to the ratio $E[X^T(k)X(k)]/\lambda_i$. Therefore, a change in the input signal
variance causes a much less dramatic change in the convergence speed and the
misadjustment is reduced.
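
The normalization described above can be sketched numerically. The following Python fragment is only an illustration under stated assumptions: it takes the power estimate to be a single-pole recursion of the form $\sigma^2(k) = \beta\,\sigma^2(k-1) + \alpha\,X^T(k)X(k)$ with step size $\mu(k) = \alpha/\sigma^2(k)$, and the function name, the random test data and the starting value `sigma2_init` are hypothetical choices that do not come from this report.

```python
import numpy as np

def normalized_step_sizes(X, alpha, beta, sigma2_init):
    """Return mu(k) = alpha / sigma2(k) for each row X[k] of the data matrix.

    Assumed recursion: sigma2(k) = beta * sigma2(k-1) + alpha * X[k]^T X[k].
    sigma2_init is a hypothetical positive starting value for the estimate.
    """
    sigma2 = sigma2_init
    mu = np.empty(X.shape[0])
    for k, x in enumerate(X):
        # single-pole low-pass filter applied to the instantaneous power
        sigma2 = beta * sigma2 + alpha * np.dot(x, x)
        mu[k] = alpha / sigma2
    return mu

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 4))      # illustrative data: 4 taps, unit variance
beta = 0.99
alpha = 1.0 - beta

mu_unit = normalized_step_sizes(X, alpha, beta, sigma2_init=4.0)
# the same data with 3x the amplitude, that is, 9x the power
mu_loud = normalized_step_sizes(3.0 * X, alpha, beta, sigma2_init=36.0)

# compare 9 * mu_loud with mu_unit: the step size scales inversely with power
scaled_match = np.allclose(9.0 * mu_loud, mu_unit)
```

Under these assumptions the product of step size and input power is unchanged when the signal level changes, which is the property the normalization is meant to provide.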
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Figure  21  Signal  Spectrum 


Table 1 Signal Characteristics

SOURCE                 θ       POWER    CENTER FREQUENCY    BANDWIDTH
desired signal         0°      0.1      0.3 f0              0.1
jammer #1              45°     1.0      0.2 f0              0.05
jammer #2              -60°    1.0                          0.07
white noise per tap    N/A     0.1      N/A                 N/A


E.  Example 


The example considered in this section is based on a linear array geometry and is
similar to that presented by Frost [11]. The array is composed of four sensors equispaced
at half-wavelength. Each tapped-delay-line filter has four taps. The tap spacing $\Delta$
corresponds to a frequency of $f_0 = 1.0$ ($f_0 = 1$ defines a frequency of $1/\Delta$ Hz). The signal
environment is characterized by one desired signal and two active jammers. The signals are
assumed to emanate from sources in the far field of the array and to impinge from directions
$\theta_d$, $\theta_{J1}$, and $\theta_{J2}$. The propagation medium is assumed to be linear and non-dispersive. The
look-direction signal and noises are assumed to be statistically uncorrelated with the non
look-direction signals, explicitly ruling out multipath. The signal environment is described
in table 1 and figure 21. The power spectral density depicted in figure 21 is a plot of
normalized frequency versus power in dB.

Figure 22 Optimal Frequency Response [curves: desired signal direction, jammer 1 direction, jammer 2 direction; horizontal axis: normalized frequency]

The  vector  of  look-direction  filter  coefficients  F  is  designed  to  provide  a 
distortionless  response.  This  desired  response  in  combination  with  our  directional  constraint 
yields  the  aforementioned  linearly  constrained  minimum  variance  distortionless  response 
(MVDR)  processor.  The  optimal  weight  vector  given  by  equation  (2-16)  produces  the 
frequency response shown in figure 22. This figure is actually three steady state plots of
normalized  frequency  versus  power  gain  superimposed  upon  one  another.  The  first  plot 
depicts  the  distortionless  look  direction  frequency  response,  while  the  second  and  third  plots 
are  the  frequency  responses  in  the  directions  of  jammer  1  and  jammer  2,  respectively. 
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The  ensemble  characteristics  with  the  known  correlation  matrix  of  the  direct  form 
and  GSC  form  TDL  structure  are  now  examined.  This  type  of  computer  investigation  is 
termed  an  analysis  model,  as  opposed  to  a  simulation  which  will  be  addressed  later.  The 
magnitude  of  the  step  size  determines  the  speed  of  convergence  and  the  steady  state 
mean-square error of the CLMS and LMS algorithms. The upper bounds, given in equations
(2-45) and (1-50), require a priori knowledge of the data correlation matrix. A step size
which is near the upper bound may lead to overshoot with data fluctuations, and the
mean-square error increases proportionally with the step size magnitude. A step size which
is too small will increase the amount of time required to converge. For this example, the
magnitude of the step size is taken to be

$$\mu = \frac{1}{10\,\mathrm{trace}(R)} \tag{2-91}$$

where $R$ is the relevant correlation matrix. It is noted that this step size is the value
recommended by Widrow [20, p. 106].
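
As a small numerical illustration of this choice, the sketch below evaluates a step size of one tenth of the reciprocal trace for a known correlation matrix. The matrix values and the function name are hypothetical and serve only to show the arithmetic. Because the trace of a positive semidefinite matrix is at least as large as its largest eigenvalue, a step size chosen this way is never larger than one tenth of $1/\lambda_{max}$.

```python
import numpy as np

def fixed_step_size(R):
    """Step size mu = 1 / (10 * trace(R)) for a known correlation matrix R."""
    return 1.0 / (10.0 * np.trace(R))

# hypothetical 3 x 3 correlation matrix, used only for illustration
R = np.array([[2.0, 0.5, 0.1],
              [0.5, 2.0, 0.5],
              [0.1, 0.5, 2.0]])

mu = fixed_step_size(R)
lam_max = np.linalg.eigvalsh(R).max()

# trace(R) >= lam_max for a positive semidefinite R,
# so mu is no larger than 0.1 / lam_max
margin = (0.1 / lam_max) - mu
```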

Figure 23 shows the ensemble average weight transients resulting from iteratively
applying the CLMS algorithm in equation (2-20). Figure 24 depicts the ensemble mean
square error or learning curves. Since the LMS algorithm's desired reference signal $d(k)$ is
equal to zero in the constrained algorithms, the mean-square error is equivalent to the
ensemble output power; that is, $E[e^2(k)] = E[(d(k) - y(k))^2] = E[y^2(k)]$. The GSC form requires
that a choice of signal blocking matrix be made for further analysis. The choice of $W_{s1}$ defined
in equation (2-82) yields a performance identical to that of the direct form, and is therefore
not presented. Figures 25 and 26 present the ensemble weight transients and output power
for the signal blocking matrix $W_{s2}$ defined in equation (2-83). For this example, the
eigenvalues of the observation correlation matrix range from 0.100 to 10.056. The
eigenvalue spread of the direct form quadratic matrix function $P R_{xx} P$ is 75.447. The
eigenvalue spread of the GSC quadratic matrix function $W_{s2} R_{xx} W_{s2}^T$ is 57.8019, depicting
the situation where the GSC performance will exceed that of the direct form.

Figure 23 Ensemble TDL Direct Form Weight Vector Transients

Figure 24 Ensemble TDL Direct Form Learning Curve

for Signal Blocking Matrix Ws2

The  simulation  model  analyzes  the  actual  behavior  of  the  adaptive  processor.  This 
is  accomplished  through  generating  all  propagating  signals  and  simulating  the  sensor  noise 
at each element. The signals impinge upon the array as they would in a true deployed system,
and  the  statistics  are  data  driven.  This  differs  from  the  analysis  model,  where  simply  an 
investigation  and  iteration  of  equations  given  the  correlation  matrix  is  performed. 

The  simulation  of  the  tapped-delay-line  structure  for  both  the  direct  form  and  GSC 
form  is  now  considered.  The  propagating  signals  are  modeled  as  zero-mean  stochastic 
processes  with  Gaussian  distributions.  The  signals  are  filtered  to  provide  the  spectral  and 
spatial  characteristics  described  in  table  1  and  figure  21.  The  constant  step  size  simulations 
are  based  on  an  LMS  gain  as  described  in  equation  (2-91).  The  normalized  step  size 
simulations  are  formulated  as  described  in  section  D  of  this  chapter. 

The simulation executed one hundred adaptations over one hundred independent
realizations of the input process. The same observation process at the sensor inputs was
used for each form to provide a meaningful comparison. The direct form constant step size
mean weight vector trajectories, ensemble averaged output power and frequency response
evaluated at the mean value of $W_{df}(101)$ are displayed for this realization in figures 27, 28
and 29. The analogous plots for the GSC are presented in figures 30, 31 and 32.

The  normalized  step  size  simulations  are  presented  for  the  same  observation  process 
realization  generated  above.  It  is  noted  that  the  ensemble  statistics  are  not  available  due  to 
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the  algorithm’s  new  dependency  upon  the  data  variance.  Approximations  to  the  ensemble 
performance  are  made  over  one  hundred  adaptations  of  the  observation  process  realizations. 

The  time-varying  LMS  algorithm  begins  with  an  initial  condition  of  the  input  signal 
variance,  which  must  be  large  for  fast  convergence  and  subsequently  decrease  in  magnitude 
for  minimum  steady  state  mean-square  error.  In  accordance  with  the  earlier  discussion  on 
convergence,  too  small  of  an  initial  variance  estimate  leads  to  a  large  initial  step  size  and  a 
corresponding  initial  overshoot.  A  value  of  the  initial  variance  estimate  which  yields  similar 
performance  to  that  exhibited  by  the  fixed  step  size  of  equation  (2-91)  has  been 
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weigh!  value 


Figure 35 TDL DF Normalized Gain Frequency Response
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F.  Conclusions 


The GSC form and the direct form TDL structure adaptive array sensor have identical
dynamic behavior if three conditions hold: the LMS and CLMS step sizes are constants which
provide the same level of misadjustment; the GSC form constraint enforced through the
conventional beamforming matrix $W_c$ is equivalent to that of the direct form algorithmic
constraint; and the GSC form signal blocking matrix $W_s$ is composed of orthogonal rows which
map the constraint nullspace. Since the emphasis in the examples was an MVDR array which provides
a distortionless look direction response, the constraint was consistent across all forms and
examples. The difficulty in choosing a signal blocking matrix composed of nonorthogonal
rows which would consistently provide a better dynamic behavior was discussed at the end
of section II.C. It was found that the GSC form signal blocking matrix implementing an
adjacent element subtraction led to a quicker convergence for the signal and array geometry
presented in the examples of this section.

The LMS algorithm with a time-varying step size presents two benefits which are
of importance in this study. First, the choice of step size is simplified due to its
update as a function of input signal variance (or equivalently, the eigenvalues of the input
process correlation matrix). Second, and most important, this normalization of the step size
leads to a reduction in the dependency between the speed of convergence and the
eigenstructure of the relevant correlation matrix. This is realized due to the fact that the
algorithm can now update each weight independently with separately valued time-varying
step sizes. Thus, each mode of the algorithm can adapt at its own speed. The correlation
matrices which determine the speed of convergence are then given by $P\,\Omega_{df}(k)\,R_{xx}\,P$
and $\Omega_{GSC}(k)\,W_s R_{xx} W_s^T$, where $\Omega_F(k)$ is a diagonal matrix of the proper dimension for form
$F$ whose elements at time $k$ are given by the individual step sizes presented in equations
(2-86) and (2-87). For the example presented, the eigenvalue spread of the normalized
algorithms and that of the standard algorithm were nearly identical. This is believed to be
due to the modest eigenvalue spread produced by this example's geometry.

III. CONSTRAINED PROCESSORS WITH ORTHOGONAL FILTER STRUCTURE


This chapter investigates the performance of the GSC form linearly constrained
wideband adaptive array sensor with the adaptive processor replaced by an orthogonal filter
structure. The motivation for utilizing an orthogonal filter realization of the adaptive
processor is that we desire to obtain a new set of data vectors which exhibit minimum
correlation to provide as the input to the adaptive filter.

Consider the GSC form array with a single distortionless constraint as shown in
figure 20 of the last chapter. The operations considered in this chapter are realized by
transforming the data present on each tap in the figure prior to weighting. This
delay-and-transform operation is conveniently represented, in general, by the transformation
matrix $Q$. This structure is depicted in figure 39. It will be seen that the elements of the
transformation matrix may be adaptive or fixed. The goal of this transformation, as stated
initially, is to provide a less correlated input $Z(k)$ to the adaptive weight vector $W_{GSC}(k)$.

Figure 39 Transform Domain GSC Processor Structure

This chapter will first consider the case of a fixed orthogonal transform structure in
section III.A, where the Discrete Fourier Transform (DFT) and the Discrete Cosine
Transform (DCT) with a normalized step size will be extended to the multichannel case of
interest. Linear prediction will be reviewed in section III.B to provide the foundation for
the derivation of the lattice filter structure in section III.C. The converged lattice structure
will be related to the Gram-Schmidt orthogonalization process, and the multichannel
adaptive lattice structure will then be examined as a replacement for the TDL processor.
The Gram-Schmidt orthogonal structure will then be derived directly in section III.D, and
the characteristics of this structure in the GSC form processor will be examined. Simulations
in section III.E will then be used to compare the performance of these structures.

A.  Fixed  Orthogonal  Transform  Domain  Structure 

Very early in the history of adaptive array research many investigators examined
frequency domain LMS filters [31,35,36]. The frequency domain transformation is usually
implemented with $Q$ being an invertible matrix composed of the DFT or the DCT
coefficients which operate in an identical manner upon each TDL. The frequency domain
has an intuitive appeal as a method of improving the performance of wideband adaptive
arrays. The transformation from the temporal domain to the frequency domain results in
frequency subbanding, in effect reducing the wideband problem to discrete frequency bins.
The initial research in this area was limited to the analysis of the LMS algorithm with a
constant step size. Compton [29] then published a report which showed that the frequency
domain structure performance was identical to that of the tapped-delay-line processor, again
utilizing a fixed step size in the LMS algorithm. Subsequently, many other researchers
began examining the use of transform domain adaptive filtering for narrowband single
channel applications which considered the use of time-varying LMS step sizes
[28,30,40,41,42,43,44,45,46,47].

Section III.A.1 will present the equivalence of invertible linear transforms for
completeness. This equivalence directly applies to the DFT and DCT processors with a
constant step size and demonstrates that the resulting array transient and steady-state
behavior is unchanged by the transformation. Section III.A.2 will then be concerned with
the main contribution of this section, the wideband multichannel extension of the transform
domain filtering research of Narayan [28], Lee [30], Clark [40] and Jenkins [42]. Section
III.A.3 will present the DCT, and section III.A.4 will consider the convergence of these
transform domain algorithms.

1) The DFT Transform: In order to facilitate the following development, we briefly
return to the TDL structure GSC in order to abbreviate notation. Define the quadratic
correlation matrix function and the quiescent response vector as

$$R_{x_s} = W_s R_{xx} W_s^T \tag{3-1}$$

$$R_{xd} = W_s R_{xx} W_c \tag{3-2}$$

Utilizing this notation, the optimal GSC weight vector in equation (2-78) may be written in
a form similar to that of the Wiener-Hopf equation which was introduced in equation (1-18)

$$W_{opt} = R_{x_s}^{-1} R_{xd} \tag{3-3}$$

The optimal value of the GSC lower path output can be written as

$$y_{opt} = W_{opt}^T X_s = X_s^T W_{opt} \tag{3-4}$$

Now, extending the derivation of Compton [29], consider the GSC form array with
an invertible transform $Q$ introduced after the signal blocking matrix and before the adaptive
processor. We note that any reversible operation cannot affect the performance of the array.
This is clear by considering the new data vector $Z$, which becomes the input to the
adaptive processor, where

$$Z(k) = Q\,X_s(k) \tag{3-5}$$

The transform domain optimal weight vector is given by

$$W_{Q,opt} = [\,Q R_{x_s} Q^T\,]^{-1} Q R_{xd} \tag{3-6}$$

and the output of the transform domain structure's lower path is

$$y_{Q,opt} = W_{Q,opt}^T Z = X_s^T R_{x_s}^{-1} R_{xd} \tag{3-7}$$

In fact, it is easily shown that the optimal weight vectors of the two structures are related by

$$W_{Q,opt} = [\,Q^T\,]^{-1} W_{opt} \tag{3-8}$$

Hence, the transform domain GSC processor with an invertible transform and the TDL GSC
($Q = I$) both converge to the Wiener-Hopf solution.
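
The equivalence argued above can be checked numerically. The sketch below is an illustration only: it uses a real-valued invertible transform so that no conjugation convention is needed, and the correlation matrix, cross-correlation vector, transform and data snapshot are all hypothetical random quantities rather than values from the example in this report.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6

A = rng.normal(size=(n, n))
R_xs = A @ A.T + n * np.eye(n)       # hypothetical positive definite correlation matrix
r_xd = rng.normal(size=n)            # hypothetical cross-correlation vector
Q = rng.normal(size=(n, n)) + n * np.eye(n)   # arbitrary invertible real transform

# Wiener-Hopf solution in the original (TDL) coordinates
w_opt = np.linalg.solve(R_xs, r_xd)

# Wiener-Hopf solution after transforming the data by Q
w_q = np.linalg.solve(Q @ R_xs @ Q.T, Q @ r_xd)

# lower path output for one data snapshot, computed both ways
x = rng.normal(size=n)
y_tdl = w_opt @ x
y_q = w_q @ (Q @ x)

# the two weight vectors differ only by the transform
w_back = Q.T @ w_q
```

In this sketch `y_tdl` and `y_q` agree to numerical precision and `w_back` reproduces `w_opt`, which is the statement that a reversible operation ahead of the adaptive processor leaves the converged output unchanged.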

Consider the symmetrical DFT which is implemented upon the signals present at
each channel of the blocking matrix output. For the l-th channel,

$$Z_{ln} = \sum_{m=1}^{J} X_{lm}\,e^{-j 2\pi (m-1)(n-1)/J}, \qquad l = 1,\ldots,K-1, \quad n = 1,\ldots,J \tag{3-9}$$

or, equivalently, for the stacked (K-1)xJ dimensional data vector $X_s$ at time $k$

$$Z(k) = Q_{DFT}\,X_s(k) \tag{3-10}$$

and $Q_{DFT}$ is simply the rank (K-1)xJ matrix of exponential coefficients which realize
equation (3-9). This algorithm has the additional benefit of not requiring an inverse
transform to obtain the time domain output [24].

Due to the symmetry of the DFT matrix, we may write equation (3-8) as

$$W_{Q_{DFT}} = Q_{DFT}^{-1}\,W \tag{3-11}$$

and the weight vector behind each element is simply the inverse DFT of the TDL structure
weight vector.

The transform relation of equation (3-11) shows that the steady state values of the
DFT and the TDL structures are the same. Following Compton [29], we now show through
the analysis of the correlation matrix eigenstructure that the transient behavior of the two
arrays is also identical for a fixed step size. For the present time, we will be concerned
solely with the frequency domain transform, and therefore the subscript DFT will be
suppressed.

Define the transform domain correlation matrix as

$$R_{zz} = Q^* R_{x_s} Q^T \tag{3-12}$$

where the operator $(\cdot)^*$ denotes the complex conjugate. Since $R_{x_s}$ and $R_{xx}$ are both assumed
to be positive definite and Hermitian, $R_{x_s}$ has a complete set of orthonormal eigenvectors
whose corresponding eigenvalues are real and positive. Denote the i-th eigenvalue of $R_{x_s}$
as $\lambda_i$ and the i-th eigenvector as $\varphi_i$. Then the orthonormal condition may be explicitly written
as

$$\varphi_i^\dagger \varphi_j = \delta_{ij} \tag{3-13}$$

We proceed by defining the transform eigenvector

$$\varphi_{zi} = Q^* \varphi_i \tag{3-14}$$

and note that

$$\varphi_{zi}^\dagger \varphi_{zj} = \varphi_i^\dagger\,[Q^*]^\dagger\,Q^* \varphi_j \tag{3-15}$$

Utilizing the symmetry of the DFT and the fact that the transformation matrix is realized in
a block diagonal form due to the same transform taking place on each element, we find

$$[Q^*]^\dagger = Q \tag{3-16}$$

$$Q^{-1} = \frac{1}{J}\,Q^* \tag{3-17}$$

and equation (3-15) becomes

$$\varphi_{zi}^\dagger \varphi_{zj} = J\,\varphi_i^\dagger\,Q\,Q^{-1} \varphi_j = J\,\delta_{ij} \tag{3-18}$$

Now, consider the eigenvector equation

$$R_{x_s} \varphi_i = \lambda_i \varphi_i \tag{3-19}$$

Substituting equation (3-14) into (3-19) and multiplying by $Q^*$ yields

$$Q^* R_{x_s}\,[Q^*]^{-1} \varphi_{zi} = \lambda_i \varphi_{zi} \tag{3-20}$$

and using the symmetry of the DFT, equations (3-16) and (3-20) can be written as

$$Q^* R_{x_s} Q^T \varphi_{zi} = J\,\lambda_i \varphi_{zi} \tag{3-21}$$

so that it is evident that

$$R_{zz} \varphi_{zi} = J\,\lambda_i \varphi_{zi} \tag{3-22}$$

Then each eigenvalue of $R_{zz}$ is simply $J$ times the corresponding eigenvalue of $R_{x_s}$, and we
conclude that the eigenvalue spreads for the frequency domain structure and the TDL
structure GSC are identical.
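
The scaling of the eigenvalues by $J$ can be confirmed with a short numerical check. The sketch below assumes an unnormalized, symmetric DFT matrix applied identically to each channel (a block diagonal transform) and a real, positive definite correlation matrix; the matrix sizes, the random correlation matrix and the channel-by-channel stacking order are hypothetical choices made for illustration.

```python
import numpy as np

J = 4            # taps per channel
channels = 3     # hypothetical number of blocking matrix output channels
n = J * channels

rng = np.random.default_rng(2)
A = rng.normal(size=(n, n))
R_xs = A @ A.T + np.eye(n)        # hypothetical real, positive definite correlation matrix

# unnormalized symmetric DFT matrix for one channel
m = np.arange(J)
F = np.exp(-2j * np.pi * np.outer(m, m) / J)

# the same transform on every channel gives a block diagonal Q
Q = np.kron(np.eye(channels), F)

# transform domain correlation matrix
R_zz = Q.conj() @ R_xs @ Q.T

lam_x = np.linalg.eigvalsh(R_xs)  # ascending eigenvalues of the TDL matrix
lam_z = np.linalg.eigvalsh(R_zz)  # ascending eigenvalues of the transformed matrix

spread_x = lam_x.max() / lam_x.min()
spread_z = lam_z.max() / lam_z.min()
```

Here every entry of `lam_z` equals `J` times the matching entry of `lam_x`, so `spread_x` and `spread_z` coincide, in agreement with the conclusion that a fixed step size gives the DFT structure no transient advantage.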

2) The DFT Frequency-Domain Structure with Subband Normalization: Both
Compton [29] and Lee and Un [30] have examined the performance of the DFT algorithm
presented in equation (3-21). Compton concluded that the TDL and DFT structures always
perform identically in his analysis of single and multiple channel adaptive filters. His work,
however, did not consider normalization of the adaptive step size. Lee and Un realized the
possibility of achieving better convergence properties through the normalization of the step
size, as have Narayan et al. [28]. However, both of the latter authors restricted their analysis
to single channel filters and hence were not able to realize the normalization conditions for
the adaptive algorithm step size which are now presented to yield speed of convergence
improvement in the adaptive array sensor problem. Following the notation of Narayan, we
define these conditions and present a method of multichannel variance averaging which
results in better dynamic behavior while achieving the same steady state Wiener solution.

Consider the normalized step size for the unconstrained LMS algorithm

$$W(k+1) = W(k) + \mu(k)\,y(k)\,X_s(k) \tag{3-23}$$

where at time $k$

$$\mu(k) = \mathrm{diag}\left[\frac{\alpha}{\sigma_n^2(k)}\right] \tag{3-24}$$

is a diagonal matrix composed of the averaged signal variances. We note that the averaging
operation for the TDL structure was a multichannel extension of the only one which can be
considered in the single channel case: averaging over the channel. Thus, for the n-th channel

$$\sigma_n^2(k) = \beta\,\sigma_n^2(k-1) + \alpha\,X_n^T(k)\,X_n(k) \tag{3-25}$$

The steady state convergence of the normalized step size LMS algorithm and the variance
estimate of equations (3-24) and (3-25) were considered in section D of chapter II.

It would seem reasonable to conclude that better transient behavior would occur
when the weights were able to adapt in each frequency bin through a normalized step size
in which the variance was averaged over the power present in that frequency bin. An
algorithm which accomplishes this utilizes the estimate

$$\mu_{f_n}(k) = \frac{\alpha}{\sigma_{f_n}^2(k)} \tag{3-26}$$

$$\sigma_{f_n}^2(k) = \beta\,\sigma_{f_n}^2(k-1) + \alpha\,Z_{f_n}^\dagger(k)\,Z_{f_n}(k) \tag{3-27}$$

and the weight state update

$$W_{DFT}(k+1) = W_{DFT}(k) + \mu(k)\,y(k)\,Z(k) \tag{3-28}$$

where $f_n$ is the n-th frequency bin.

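
A single iteration of a subband-normalized update of this kind might be organized as in the sketch below. This is a hedged illustration, not the implementation used for the simulations: it assumes that the power in each frequency bin is accumulated over all channels through a single-pole recursion with hypothetical parameters `alpha` and `beta`, and it applies the weight update in its literal form without fixing a conjugation convention for complex data, which would have to match how the output $y(k)$ is formed. The function name and array layout are also assumptions.

```python
import numpy as np

def subband_normalized_update(W, Z, y, sigma2, alpha, beta):
    """One weight update with a separate step size for each frequency bin.

    W, Z   : arrays of shape (channels, J); column n holds frequency bin n
    y      : array output sample at this iteration
    sigma2 : array of shape (J,), running power estimate for each bin
    """
    # instantaneous power in each bin, summed over all channels
    bin_power = np.sum(np.abs(Z) ** 2, axis=0)
    sigma2 = beta * sigma2 + alpha * bin_power
    mu = alpha / sigma2                      # one step size per bin
    W = W + mu[np.newaxis, :] * y * Z        # literal form of the weight update
    return W, sigma2

# illustrative call: 2 channels, 4 bins, bin n carries amplitude n + 1
Z = np.tile(np.array([1.0, 2.0, 3.0, 4.0], dtype=complex), (2, 1))
W0 = np.zeros((2, 4), dtype=complex)
W1, sigma2 = subband_normalized_update(W0, Z, y=1.0, sigma2=np.ones(4),
                                       alpha=0.1, beta=0.9)
```

The design point is that bins carrying more power receive a proportionally smaller step size, so that each frequency bin adapts at a comparable rate.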

3) The Discrete Cosine Transform: The DCT has the computational advantage of
using only real numbers to provide a transform of the input data. Further, this transform
was chosen for comparison to the DFT since recent articles in the literature [24, 28, 44]
reported that the narrowband single channel DCT adaptive filter provided better results than
the DFT and other orthogonal transform filters for a class of data used in speech related
applications.

The DCT orthogonal transform for the l-th channel is given by

$$Z_{ln} = \sum_{m=1}^{J} k_n\,X_{lm} \cos\left(\frac{\pi (2m-1)(n-1)}{2J}\right), \qquad l = 1,\ldots,K-1, \quad n = 1,\ldots,J \tag{3-29}$$

which is represented at time $k$ by the (K-1)xJ dimensional vector

$$Z(k) = Q_{DCT}\,X_s(k) \tag{3-30}$$

and $Q_{DCT}$ is simply the rank (K-1)xJ matrix of real coefficients which realize equation (3-29).

4) The Transform Domain Convergence: The speed of convergence of the weight
vector $W_{DFT}(k)$ is a function of the eigenvalue spread of the correlation matrix

$$\mu R_{zz} = \mu\,[\,Q^* R_{x_s} Q^T\,] \tag{3-31}$$

We now assume without any loss of generality that the observation process variance is unity.
Since the trace of a matrix is the sum of its spectrum, for any square matrix $R$ we can say

$$\lambda_{max} \le \mathrm{trace}(R) \tag{3-32}$$

Similarly, the determinant of a matrix is the product of its spectrum. For any positive definite
Hermitian matrix whose rank is greater than two, it can be shown [28] that

$$\lambda_{min} \ge \det(R) \tag{3-33}$$

Therefore, an upper bound for the eigenvalue spread can be expressed as

$$\frac{\lambda_{max}}{\lambda_{min}} \le \frac{\mathrm{trace}(R)}{\det(R)} \tag{3-34}$$

From equation (3-31), the trace and determinant are expanded as

$$\mathrm{trace}(\mu R_{zz}) = \mathrm{trace}(R_{x_s}) = KJ \tag{3-35}$$

$$\det(\mu R_{zz}) = \det(\mu)\,\det(R_{zz}) \tag{3-36}$$

and from equation (3-34), the upper bound is found to be

$$\frac{\lambda_{max}}{\lambda_{min}} \le \frac{KJ}{\det(\mu)\,\det(R_{zz})} \tag{3-37}$$

Since the input process variance was assumed to be unity, the determinant of $\mu$ will be less
than or equal to unity. Thus,

$$\frac{\lambda_{max}(\mu R_{zz})}{\lambda_{min}(\mu R_{zz})} \le \frac{\lambda_{max}(R_{zz})}{\lambda_{min}(R_{zz})} \tag{3-38}$$

The equality condition holds when $\mu$ is normalized through the averaging of element
variances, and the subband normalization of equations (3-26) through (3-28) improves the
transient characteristics of the sensor, as will be shown in the simulations at the end of this
chapter.
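
The effect of power normalization on the eigenvalue spread can be illustrated with a small constructed case. The sketch below does not verify the bounds quoted from [28]; it only shows, for a hypothetical transform domain correlation matrix whose bins carry very different powers, that multiplying by a diagonal matrix of inverse bin powers reduces the spread and makes the trace equal to the matrix dimension. All numerical values are invented for illustration.

```python
import numpy as np

# hypothetical unit-diagonal correlation structure between three bins
C = np.array([[1.0, 0.2, 0.0],
              [0.2, 1.0, 0.2],
              [0.0, 0.2, 1.0]])
power = np.array([100.0, 1.0, 0.01])     # hypothetical, widely different bin powers

D_half = np.diag(np.sqrt(power))
R_zz = D_half @ C @ D_half               # correlation matrix with unequal bin powers

mu = np.diag(1.0 / np.diag(R_zz))        # diagonal step size matrix: inverse bin powers

def spread(M):
    # eigenvalues are real here because M is similar to a symmetric matrix
    lam = np.linalg.eigvals(M).real
    return lam.max() / lam.min()

spread_raw = spread(R_zz)
spread_norm = spread(mu @ R_zz)
trace_norm = np.trace(mu @ R_zz)
```

In this constructed case the normalized matrix has the same eigenvalues as `C`, so `spread_norm` is far smaller than `spread_raw`, and `trace_norm` equals the dimension of the matrix, mirroring the unit-variance trace relation used in the text.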
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B.  Linear  Prediction 


The  lattice  structure  to  be  derived  solves  the  adaptive  array  sensor  problem  by 
performing two optimum estimation operations jointly. The first is linear prediction, which
is  used  to  transform  the  correlated  inputs  into  a  corresponding  sequence  of  uncorrelated 
backward  error  predictions.  The  second  estimation  is  the  familiar  optimum  filtering 
operation  which  produces  the  estimate  of  the  desired  response,  or  equivalently,  the  array 
output.  We  now  derive  the  optimum  forward  and  backward  scalar  linear  predictors.  This 
development  follows  Haykin  [27]. 

1) Forward Linear Prediction: The forward linear prediction problem is concerned
with predicting a future value of a stationary discrete-time stochastic process given a set of
past sample values of the process. Consider the time series {x(k), x(k-1), ..., x(k-J)} which is
composed of J+1 samples. The operation of linear prediction makes an estimate of x(k)
given the sample values x(k-1) through x(k-J). Let $\mathcal{X}_{k-1}$ denote the J-dimensional space
spanned by {x(k-1), x(k-2), ..., x(k-J)} and $\hat{x}(k \mid \mathcal{X}_{k-1})$ denote the predicted value of x(k) given
this set of samples. The predicted value may, in general, be expressed as some function $\Theta$
of the given samples

$$\hat{x}(k \mid \mathcal{X}_{k-1}) = \Theta(x(k-1), x(k-2), \ldots, x(k-J)) \tag{3-39}$$

and is termed linear prediction when the function $\Theta$ consists of a linear combination of the
samples in the form

$$\hat{x}(k \mid \mathcal{X}_{k-1}) = \sum_{n=1}^{J} w_{on}\,x(k-n) \tag{3-40}$$
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The  forward  prediction  error  equals  the  difference  between  the  actual  sample  value 
x(k) at time k and its predicted value $\hat{x}(k \mid \mathcal{X}_{k-1})$. The forward prediction error is denoted
$f_J(k)$ and given by

$$f_J(k) = x(k) - \hat{x}(k \mid \mathcal{X}_{k-1}) \tag{3-41}$$

where the subscript $J$ signifies the order.

The single channel forward prediction operation is depicted in figure 40. The
predictor consists of J unit-delays and J tap weights $w_{on}$, which are assumed to be
optimized in the mean-square sense and fed with the respective delayed samples of the
observation process. The resultant output is the predicted value of x(k) given by equation
(3-40). Then, we may write equation (3-41) as

$$f_J(k) = x(k) - \sum_{n=1}^{J} w_{on}\,x(k-n) \tag{3-42}$$

Let $a_{Jn}$, n = 0, 1, ..., J denote the tap weight values of a new TDL filter which are related to the
tap weights of the forward prediction filter as follows:

$$a_{Jn} = \begin{cases} 1, & n = 0 \\ -w_{on}, & n = 1, 2, \ldots, J \end{cases} \tag{3-43}$$

Then, equation (3-42) may be expressed as

$$f_J(k) = \sum_{n=0}^{J} a_{Jn}\,x(k-n) \tag{3-44}$$

which yields the filter depicted in figure 41 and is termed a forward prediction-error filter.

2) Backward Linear Prediction: We may also operate on the time series {x(k),
x(k-1), ..., x(k-J+1)} to make a prediction of the sample x(k-J). Let $\mathcal{X}_k$ denote the
J-dimensional space spanned by {x(k), x(k-1), ..., x(k-J+1)}. Then we may write

$$\hat{x}(k-J \mid \mathcal{X}_k) = \sum_{n=1}^{J} g_n\,x(k-n+1) \tag{3-45}$$

to represent a linear prediction of the sample x(k-J), where $g$ is the J-dimensional vector of
tap weights which are also assumed to be optimized in the mean-square sense. In the case
of backward prediction, the desired response is given by d(k) = x(k-J) and the backward error
equals the difference between the actual sample value x(k-J) and its predicted value
$\hat{x}(k-J \mid \mathcal{X}_k)$. The backward prediction error is given by $b_J(k)$ where

$$b_J(k) = x(k-J) - \hat{x}(k-J \mid \mathcal{X}_k) \tag{3-46}$$

and, from equation (3-45), we may write

$$b_J(k) = x(k-J) - \sum_{n=1}^{J} g_n\,x(k-n+1) \tag{3-47}$$

Defining the tap weights of the backward prediction-error filter in terms of the corresponding
backward predictor weights as

$$c_{Jn} = \begin{cases} -g_{n+1}, & n = 0, 1, \ldots, J-1 \\ 1, & n = J \end{cases} \tag{3-48}$$

we may write equation (3-47) as

$$b_J(k) = \sum_{n=0}^{J} c_{Jn}\,x(k-n) \tag{3-49}$$

where the backward predictor and backward prediction-error filter are depicted in figures
42 and 43, respectively.

Figure 42 Backward Predictor

Figure 43 Backward Prediction Error Filter

3) The Solution and Relationship of Prediction Weight Vectors: Assuming
stationarity, the correlation matrix for both the forward and backward processes may be
expressed as

$$R = E[x\,x^T] \tag{3-50}$$

where $x$ is the J-dimensional vector composed of the observation process samples. The
cross-correlation vectors may be formulated for the forward and backward predictors as $r$
and $r^B$, respectively:

$$r = E\begin{bmatrix} x(k-1)\,x(k) \\ x(k-2)\,x(k) \\ \vdots \\ x(k-J)\,x(k) \end{bmatrix} = \begin{bmatrix} r(-1) \\ r(-2) \\ \vdots \\ r(-J) \end{bmatrix} \tag{3-51}$$

$$r^B = E\begin{bmatrix} x(k)\,x(k-J) \\ x(k-1)\,x(k-J) \\ \vdots \\ x(k-J+1)\,x(k-J) \end{bmatrix} = \begin{bmatrix} r(J) \\ r(J-1) \\ \vdots \\ r(1) \end{bmatrix} \tag{3-52}$$

The solutions to the forward and backward linear prediction problems are given by the
Wiener-Hopf equation as

$$w_o = R^{-1} r \tag{3-53}$$

$$g = R^{-1} r^B \tag{3-54}$$

Denoting the vector formed by reversing the elements of the vector $g$ as $g^B$, we note from
equations (3-51), (3-52), (3-53) and (3-54) that

$$g^B = w_o \tag{3-55}$$

and the ensemble error variances for the forward and backward predictors are identical since
$(r^B)^T g = r^T g^B$ and

$$E[\,|b_J|^2\,] = r(0) - (r^B)^T g = r(0) - r^T g^B = r(0) - r^T w_o = E[\,|f_J|^2\,] \tag{3-56}$$
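
The relationship between the forward and backward predictors can be checked numerically for a real stationary process. The sketch below assumes a hypothetical autocorrelation sequence $r(n) = 0.8^{|n|}$ and a predictor order of three; these values, and the variable names, are illustrative and do not come from the report.

```python
import numpy as np

J = 3
rho = 0.8
r_seq = rho ** np.arange(J + 1)          # hypothetical r(0), r(1), ..., r(J)

# J x J Toeplitz correlation matrix of the data vector
idx = np.arange(J)
R = r_seq[np.abs(idx[:, None] - idx[None, :])]

r_fwd = r_seq[1:]                        # [r(1), ..., r(J)]; equals [r(-1), ..., r(-J)] for real data
r_bwd = r_fwd[::-1]                      # [r(J), ..., r(1)]

w_o = np.linalg.solve(R, r_fwd)          # forward predictor weights
g = np.linalg.solve(R, r_bwd)            # backward predictor weights

err_fwd = r_seq[0] - r_fwd @ w_o         # forward prediction error variance
err_bwd = r_seq[0] - r_bwd @ g           # backward prediction error variance

a = np.concatenate(([1.0], -w_o))        # forward prediction-error filter taps
```

For this sequence the reversed backward weight vector `g[::-1]` reproduces `w_o`, and the two error variances are equal, as the Wiener-Hopf solutions above require.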

C.  The  Lattice  Filter  Structure 


We now extend the results of the previous section to the vector case of interest and
consider the GSC data vector $X_s$ derived in the last chapter. The lattice filter solves the
prediction problem by finding orthogonal bases for the subspaces $\mathcal{X}_{k-1}$ and $\mathcal{X}_k$. The
Wiener-Hopf solution for the TDL structure optimum filter derived in chapter I determined
the weighting coefficients associated with each basis vector of the subspace of past
observations such that the prediction error was orthogonal with respect to that subspace.
The lattice structure differs in that one first constructs an orthogonal basis of the subspace
of past observations, and then projects the vector $X_s(k)$ successively onto the orthogonal
basis vectors. Consequently, since the projections are formed onto the orthogonal basis
vectors, successive stages of the lattice are decoupled. Hence, one may increase the order
of the filter by adding additional stages to the lattice while the original lower order predictor
remains optimal in the expanded structure. Thus, it is no longer necessary to use the
fixed-order Wiener-Hopf equation to determine the optimal filter coefficients.

1) The Optimal Lattice Filter: The lattice filter structure is derived by employing a
recursive formulation of the Gram-Schmidt orthogonalization procedure for orthogonal
projections. Following Strobach [26], we denote the inner product of two vectors A and
B as <A , B>. Let the (K-1) dimensional vector at stage n, $F_n$, be the complement of the
orthogonal projection of a vector $X_s$ onto the subspace $\mathcal{X}_n$, denoted

$$ F_n = X_s\langle \mathcal{X}_{n-1},\, X_s(n) \rangle = X_s\langle \mathcal{X}_n \rangle \tag{3-57} $$

with the property

$$ F_n^T\, X_s(n) = 0, \qquad 1 \le n \le J \tag{3-58} $$

Then the orthogonal complement $F_n$ of the n-th order projection can be constructed
order-recursively from the orthogonal complement $F_{n-1}$ of the (n-1)-th order projection
using the recursion formula

$$ X_s\langle \mathcal{X}_{n-1},\, X_s(n) \rangle = X_s\langle \mathcal{X}_{n-2},\, X_s(n-1) \rangle + K_n^T\, X_s(n)\langle \mathcal{X}_{n-2},\, X_s(n-1) \rangle \tag{3-59} $$

or, equivalently,

$$ F_n = F_{n-1} + K_n^T\, X_s(n)\langle \mathcal{X}_{n-1} \rangle \tag{3-60} $$

where the (K-1) x (K-1) matrix $K_n$ is given by

$$ K_n = -\left( X_s(n)\langle \mathcal{X}_{n-1} \rangle\; X_s(n)\langle \mathcal{X}_{n-1} \rangle^T \right)^{-1} X_s(n)\langle \mathcal{X}_{n-1} \rangle\; X_s\langle \mathcal{X}_{n-1} \rangle^T \tag{3-61} $$

This can be proved by considering $F_n(K_n)$ as a vector constructed by the linear combination

$$ F_n(K_n) = X_s\langle \mathcal{X}_{n-1} \rangle + K_n^T\, X_s(n)\langle \mathcal{X}_{n-1} \rangle \tag{3-62} $$

Then, $F_n$ is orthogonal with respect to the subspace $\mathcal{X}_{n-1}$ extended by the vector $X_s(n)$ and,
equivalently, is orthogonal to $\mathcal{X}_{n-1}$ and $X_s(n)\langle \mathcal{X}_{n-1} \rangle$ if and only if the parameter $K_n$ is
adjusted such that the Euclidean norm of $F_n(K_n)$ attains a minimum. This follows directly
from the geometrical considerations of the Wiener-Hopf solution and leads to a least-squares
determination of $K_n$ via the approach

$$ \min_{K_n}\; F_n^T F_n \tag{3-63} $$

Substituting equation (3-62) into (3-63), taking the gradient and setting it equal to zero gives

$$ \frac{\partial\, (F_n^T F_n)}{\partial K_n} = 2\, X_s(n)\langle \mathcal{X}_{n-1} \rangle\; X_s\langle \mathcal{X}_{n-1} \rangle^T + 2\, X_s(n)\langle \mathcal{X}_{n-1} \rangle\; X_s(n)\langle \mathcal{X}_{n-1} \rangle^T K_n = 0 \tag{3-64} $$

which yields equation (3-61) and determines $K_n$ such that $F_n$ is orthogonal with respect to
the extended subspace spanned by $\mathcal{X}_{n-1}$ and $X_s(n)\langle \mathcal{X}_{n-1} \rangle$, or equivalently, with respect to
the subspace $\mathcal{X}_n$.
The vector $X_s(k)$ may be projected successively onto the components of the subspace
of past observations as follows:

$$ \begin{aligned} F_0(k) &= X_s(k) \\ F_1(k) &= X_s(k)\langle X_s(k-1) \rangle \\ F_2(k) &= X_s(k)\langle X_s(k-1),\, X_s(k-2) \rangle \\ &\;\;\vdots \\ F_J(k) &= X_s(k)\langle X_s(k-1),\, X_s(k-2),\, \ldots,\, X_s(k-J) \rangle \end{aligned} \tag{3-65} $$

Similarly, we may successively construct an orthogonal basis of the same subspace as

$$ \begin{aligned} B_0(k) &= X_s(k) \\ B_1(k) &= X_s(k-1)\langle X_s(k) \rangle = X_s(k-1)\langle B_0(k) \rangle \\ B_2(k) &= X_s(k-2)\langle X_s(k),\, X_s(k-1) \rangle = X_s(k-2)\langle B_0(k),\, B_1(k) \rangle \\ &\;\;\vdots \\ B_J(k) &= X_s(k-J)\langle X_s(k),\, X_s(k-1),\, \ldots,\, X_s(k-J+1) \rangle \end{aligned} \tag{3-66} $$

These nonrecursive decompositions are a consequence of applying the Gram-Schmidt
orthogonalization procedure. They can be made recursive by applying equations (3-59),
(3-60) and (3-61) to the last terms in equations (3-65) and (3-66):

$$ F_n(k) = X_s(k)\langle X_s(k-1), \ldots, X_s(k-n+1) \rangle + K_n^{fT}(k)\, X_s(k-n)\langle X_s(k-1), \ldots, X_s(k-n+1) \rangle \tag{3-67} $$

$$ B_n(k) = X_s(k-n)\langle X_s(k-1), \ldots, X_s(k-n+1) \rangle + K_n^{bT}(k)\, X_s(k)\langle X_s(k-1), \ldots, X_s(k-n+1) \rangle \tag{3-68} $$

Equations (3-67) and (3-68) may be expressed in terms of the orthogonalized vectors
$F_{n-1}(k)$ and $B_{n-1}(k)$ at stage n-1, establishing the recursive laws

$$ F_n(k) = F_{n-1}(k) + K_n^{fT}(k)\, B_{n-1}(k-1) \tag{3-69} $$

$$ B_n(k) = B_{n-1}(k-1) + K_n^{bT}(k)\, F_{n-1}(k) \tag{3-70} $$

with the initial condition

$$ F_0(k) = B_0(k) = X_s(k) \tag{3-71} $$

From equation (3-61), the matrices $K_n^f(k)$ and $K_n^b(k)$, termed the forward and backward
reflection coefficient matrices, respectively, can be defined as

$$ K_n^f(k) = -\left( B_{n-1}(k-1)\, B_{n-1}^T(k-1) \right)^{-1} \left( B_{n-1}(k-1)\, F_{n-1}^T(k) \right) \tag{3-72} $$

$$ K_n^b(k) = -\left( F_{n-1}(k)\, F_{n-1}^T(k) \right)^{-1} \left( F_{n-1}(k)\, B_{n-1}^T(k-1) \right) \tag{3-73} $$

It is noted that equations (3-72) and (3-73) are identical to those presented by Griffiths [39,
equations 11a and 11b].

The preceding derivation was concerned solely with the optimal predictors. The
symmetry of the autocorrelation function showed that the optimal backward predictor
coefficients are the mirror image of the optimal forward prediction coefficients and that the
backward and forward prediction errors have the same norm (the lengths are the same, but
the error signals themselves are not). The backward prediction errors are orthogonal to each
other and time-shifted versions of both the forward and backward prediction errors are
orthogonal. Thus, the generation of a sequence of backward prediction errors by a lattice
filter consisting of n stages is equivalent to a Gram-Schmidt orthogonalization process
applied recursively to a corresponding sequence of input samples. Haykin [27, pp. 173-178]
shows that this transformation of the tap-input vector $X_s(k)$ into the backward
prediction-error vector $B_J(k)$ can be accomplished through the premultiplication of the input
vector by a lower triangular matrix L (where, from the preceding section, Q = L) with 1's
along the diagonal. The non-zero elements along each row of the matrix L are defined by
the tap weights of the backward prediction-error filter whose order corresponds to the
position of the pertinent row in the matrix. This matrix L, where we may explicitly write

$$ B_J(k) = L\, X_s(k) \tag{3-74} $$

is nonsingular and hence, there is a one-to-one correspondence between the input vector and
the backward prediction-error vector. It is again emphasized that these properties are
applicable only to the optimal predictors.
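The structure of the matrix L in equation (3-74) can be illustrated for a scalar stationary process. In the sketch below, each row of L is built from the optimal backward prediction-error filter of the corresponding order, and the product $L R L^T$ is then diagonal, which is the statement that the backward prediction errors are mutually orthogonal. The function name and the autocorrelation values are assumptions made for illustration only.

```python
import numpy as np

def backward_error_transform(r):
    """Build the unit lower triangular matrix L from an autocorrelation sequence.

    r[m] = r(m) for m = 0..M-1 (real, stationary). Row m of L holds the tap
    weights of the order-m backward prediction-error filter, which predicts
    x(k-m) from x(k), ..., x(k-m+1).
    """
    r = np.asarray(r, dtype=float)
    M = len(r)
    R = np.array([[r[abs(i - j)] for j in range(M)] for i in range(M)])
    L = np.eye(M)
    for m in range(1, M):
        # Wiener-Hopf solution for the order-m backward predictor
        g = np.linalg.solve(R[:m, :m], R[:m, m])
        L[m, :m] = -g
    return R, L

# Hypothetical autocorrelation sequence (illustration only)
r = 0.9 ** np.arange(4)
r[0] += 0.5
R, L = backward_error_transform(r)
D = L @ R @ L.T   # correlation matrix of the backward prediction errors
print(np.round(D, 6))
```

The diagonal of `D` holds the backward prediction-error powers for orders 0 through M-1. Since L has ones on its diagonal, it is nonsingular, in agreement with the one-to-one correspondence noted above.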
2) The Adaptive Lattice Structure GSC Form Processor: The GSC form array
presented in Chapter II will now be examined with a data dependent adaptive lattice filter
structure replacing the TDL processor. This section follows the work of Griffiths [37, 38, 39]
and Lee, Chang, Cha, Kim and Youn [48]. The all-zero fixed coefficient lattice filter has
the same transfer function as the fixed coefficient TDL filter, and the scalar filter coefficient
conversions are presented clearly in Oppenheim and Schafer [25]. The lattice filter, as
derived above, achieves the transfer function through an orthogonalization procedure. This
property of the lattice structure will be shown to be capable of providing desirable
convergence properties in the adaptive multichannel structure. The recursive form of the
lattice structure GSC is presented in figure 44.

The basic adaptive lattice stage represented by each box in figure 44 is shown in
figure 45. The delayed observation data sequence $X_s(k-i)$ is transformed into the orthogonal
sequence $B_i(k)$ through the Gram-Schmidt type relations described in equations (3-69),
(3-70) and (3-71) and, with a slight change of notation, repeated below:


$$ B_0(k) = F_0(k) = X_s(k) \tag{3-75} $$

$$ B_i(k) = B_{i-1}(k-1) - w_i^{bT}(k)\, F_{i-1}(k) \tag{3-76} $$

$$ F_i(k) = F_{i-1}(k) - w_i^{fT}(k)\, B_{i-1}(k-1) \tag{3-77} $$

These order-update equations relate the higher order forward and backward
prediction errors to lower order prediction errors. The signal $B_i(k)$ is the backward residual
at stage i, and $F_i(k)$ is the forward residual at stage i. Both of the residual vectors at time k
are of dimension (K-1) x 1. The backward and forward reflection coefficient matrices
$w_i^b(k)$ and $w_i^f(k)$ are of dimension (K-1) x (K-1) and are commonly termed partial
correlation (PARCOR) coefficients. The reflection coefficient matrices in equations (3-76)
and (3-77) are recursively updated through the use of the LMS algorithm to minimize the
mean squared norm of the residual vectors:

$$ w_i^f(k+1) = w_i^f(k) + \mu_i^f(k)\, B_{i-1}(k-1)\, F_i^T(k) \tag{3-78} $$

$$ w_i^b(k+1) = w_i^b(k) + \mu_i^b(k)\, F_{i-1}(k)\, B_i^T(k) \tag{3-79} $$

The optimal PARCOR coefficients are independent of the filter order, so that the PARCOR
values in any one stage do not depend on those of other stages.
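The order updates and LMS updates of the lattice stages can be sketched in a few lines. The example below is an illustration under simplifying assumptions, not the simulation code of this report: it uses a constant scalar step size `mu` in place of the normalized time-varying step size matrices, and the function name `lattice_lms_step`, the two-channel first-order autoregressive input and all numerical values are hypothetical.

```python
import numpy as np

def lattice_lms_step(x, Wf, Wb, B_delayed, mu):
    """One time step of an LMS multichannel lattice.

    x         : current input vector, used as F_0(k) = B_0(k)
    Wf, Wb    : lists of forward / backward reflection matrices, one per stage
    B_delayed : list holding B_0(k-1), B_1(k-1), ... from the previous step
    mu        : constant scalar step size (a simplifying assumption)
    Entry i of each list belongs to stage i+1 of the lattice.
    """
    F = [x.copy()]
    B = [x.copy()]
    for i in range(len(Wf)):
        F_prev, B_prev = F[i], B_delayed[i]
        F_next = F_prev - Wf[i].T @ B_prev       # forward order update
        B_next = B_prev - Wb[i].T @ F_prev       # backward order update
        Wf[i] += mu * np.outer(B_prev, F_next)   # LMS update, forward
        Wb[i] += mu * np.outer(F_prev, B_next)   # LMS update, backward
        F.append(F_next)
        B.append(B_next)
    return F, B

rng = np.random.default_rng(0)
N, stages, mu, steps = 2, 2, 0.005, 4000
Wf = [np.zeros((N, N)) for _ in range(stages)]
Wb = [np.zeros((N, N)) for _ in range(stages)]
B_delayed = [np.zeros(N) for _ in range(stages)]
x = np.zeros(N)
in_power, out_power = [], []
for k in range(steps):
    # hypothetical input: independent AR(1) processes on each channel
    x = 0.8 * x + 0.6 * rng.standard_normal(N)
    F, B = lattice_lms_step(x, Wf, Wb, B_delayed, mu)
    B_delayed = [b.copy() for b in B[:stages]]
    in_power.append(x @ x)
    out_power.append(F[1] @ F[1])

print(np.mean(in_power[-1000:]), np.mean(out_power[-1000:]))
print(np.round(Wf[0], 2))
```

For this input the first-stage forward reflection matrix should settle near 0.8 times the identity, and the first-stage forward residual power should fall well below the input power.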

The Gram-Schmidt type of orthogonalization which the lattice filter stages form may
increase the speed of adaptation in subsequent stages. The residual becomes increasingly
white as the order of the filter increases. The backward residuals from stage to stage are
orthogonal after the PARCOR coefficients converge, resulting in the aforementioned overall
convergence rate increase.

Consider the lattice filter structure implementation of the GSC shown in figure 44.
The J coefficient vectors $G_i(k)$ are of dimension (K-1) and utilize the LMS algorithm to
minimize the mean squared value of the i-th stage error signal $\epsilon_i(k)$:

$$ \epsilon_0(k) = y_c(k) - G_0^T(k)\, B_0(k) \tag{3-80} $$

$$ \epsilon_i(k) = \epsilon_{i-1}(k) - G_i^T(k)\, B_i(k) \tag{3-81} $$

$$ G_i(k+1) = G_i(k) + \mu_i(k)\, \epsilon_i(k)\, B_i(k) \tag{3-82} $$
The time-varying step size gains are normalized to the input signal variance, and are diagonal
matrices, given by equations (3-83), (3-84) and (3-85). For the estimation weights $G_i(k)$, the
forward reflection coefficients and the backward reflection coefficients, the (j,j) element
of the step size matrix is a gain constant divided by a recursively smoothed power estimate
of the j-th element of $B_i(k)$, $B_{i-1}(k-1)$ and $F_{i-1}(k)$, respectively. All off-diagonal elements
are zero.

Once the PARCOR coefficients have converged, the convergence rate of the lattice
structure GSC estimation weight vector $G_i(k)$ is no longer dependent on the eigenvalue
spread of the correlation matrix $R_{xx}$, but upon the (K-1) x (K-1) dimensional correlation
matrices of the forward and backward prediction errors. It is this property which provides
the capability for a faster convergence rate which cannot be achieved with the corresponding
TDL processor.

The process of generating the backward prediction error process from the
observation process can in general be represented by the matrix operator Q = L, with

$$ B(k) = L(k)\, X_s(k) \tag{3-86} $$

where the matrix L(k) is data dependent and changes with each adaptation in accordance
with equation (3-76). Once the PARCOR matrices converge, this transformation takes on
the lower triangular form mentioned in section III.C.1 and the input to the conventional
weighting structure $G(k) = [G_0(k)\; G_1(k) \ldots G_{J-1}(k)]^T$ is orthogonal due to the realization of
a Gram-Schmidt transformation.
The direct implementation of the Gram-Schmidt algorithm serves as an alternative
method of realizing a completely orthogonal signal set to serve as the input to the adaptive
processor. The direct Gram-Schmidt orthogonal structure utilizing the LMS algorithm was
first developed by Griffiths and utilized a constant step size. This was later modified
by Lee et al. [48] to include both a time-varying step size and an escalator realization, where
the unit lower triangular transform of $LDL^T$ factorization is utilized. Following Griffiths [39],
the structure may be realized in the form of figure ^9, where the matrix Q is composed of
time-varying coefficients and the (K-1)J outputs satisfy

$$ E\left[ z_m(k)\, z_n(k) \right] = 0, \qquad m \ne n \tag{3-87} $$

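The matrix Q is lower triangular and composed of elements $q_{i,j}$ which may be
represented as

$$ Q = \begin{bmatrix} 1 & 0 & \cdots & & 0 \\ q_{2,1} & 1 & 0 & \cdots & 0 \\ \vdots & & \ddots & & \vdots \\ & & & 1 & 0 \\ q_{(K-1)J,1} & \cdots & & q_{(K-1)J,(K-1)J-1} & 1 \end{bmatrix} \tag{3-88} $$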

Figure 46 Gram-Schmidt Structure GSC Form Array Processor Lower Path

The orthogonalization procedure generates the orthogonal output $z_m(k)$ via the recursive
relationship

$$ \begin{aligned} y_{m,1}(k) &= x_m(k) \\ y_{m,m}(k) &= z_m(k) \\ z_m(k) &= x_m(k) - \sum_{n=1}^{m-1} c_{m,n}(k)\, z_n(k), \qquad 2 \le m \le (K-1)J \end{aligned} \tag{3-89} $$

where $x_m(k)$ is the m-th element of the input data vector, the intermediate values are
$y_{m,n+1}(k) = y_{m,n}(k) - c_{m,n}(k)\, z_n(k)$, and the value of $c_{m,n}$ is chosen to minimize the local
values $E[y_{m,n+1}^2(k)]$ shown in figure 46. In conjunction with the method of gradient descent,
we may write

$$ \frac{\partial E\left[ y_{m,n+1}^2(k) \right]}{\partial c_{m,n}(k)} = -2\, E\left[ y_{m,n+1}(k)\, z_n(k) \right] \tag{3-90} $$

This result may be achieved through the use of the LMS algorithm to update the
adaptive coefficients $c_{m,n}$:

$$ c_{m,n}(k+1) = c_{m,n}(k) + \mu_n(k)\, y_{m,n+1}(k)\, z_n(k) \tag{3-91} $$

where $\mu_n(k)$ is the time-varying step size (absorbing the factor of two), formed in the same
manner established earlier in this research.

The matrix Q in equation (3-88) is then given by $[I + C]^{-1}$, where C is lower triangular with
zeros on the diagonal and elements $c_{m,n}$. The form of the Gram-Schmidt structure, presented
in figure 46, depicts the generation of the matrix C. The orthogonality in this structure is
complete after the convergence of the adaptive coefficients via the LMS algorithm in
equation (3-91).
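The escalator form of the Gram-Schmidt orthogonalizer can be sketched as follows. This is a minimal illustration under stated assumptions rather than the processor simulated in this report: the power-normalized step size (constants `alpha` and `beta`), the function name `gs_lms_step` and the mixing matrix `A` used to generate correlated input data are all hypothetical.

```python
import numpy as np

def gs_lms_step(x, C, P, alpha=0.01, beta=0.99):
    """One adaptation of an LMS Gram-Schmidt orthogonalizer.

    C (strictly lower triangular coefficients) and P (smoothed output
    powers) are updated in place. Indices are zero-based here.
    """
    M = len(x)
    y = np.array(x, dtype=float)   # y[m] holds y_{m,n} as the stage index n advances
    z = np.zeros(M)
    for n in range(M):
        z[n] = y[n]                                     # z_n = y_{n,n}
        P[n] = beta * P[n] + (1.0 - beta) * z[n] ** 2   # smoothed power of z_n
        y[n + 1:] -= C[n + 1:, n] * z[n]                # y_{m,n+1} = y_{m,n} - c_{m,n} z_n
        C[n + 1:, n] += (alpha / P[n]) * y[n + 1:] * z[n]  # normalized LMS update
    return z

rng = np.random.default_rng(1)
M, steps = 4, 6000
idx = np.arange(M)
A = 0.7 ** np.abs(idx[:, None] - idx[None, :])   # hypothetical mixing matrix
C = np.zeros((M, M))
P = np.ones(M)
Z = np.zeros((steps, M))
for k in range(steps):
    Z[k] = gs_lms_step(A @ rng.standard_normal(M), C, P)

# Sample correlation coefficients of the outputs after convergence
corr = np.corrcoef(Z[-3000:].T)
max_offdiag = np.max(np.abs(corr - np.eye(M)))

# One more step, keeping the coefficients that were used to form the output
x_last = A @ rng.standard_normal(M)
C0 = C.copy()
z_last = gs_lms_step(x_last, C, P)
print(max_offdiag, np.allclose(z_last, np.linalg.solve(np.eye(M) + C0, x_last)))
```

The last line checks the relation $Q = [I + C]^{-1}$ directly: the output of one step equals $[I + C]^{-1}$ applied to the input, with C taken before that step's update.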


E.  Example  and  Transient  Analysis 

Thus far in this research, we have derived and examined the TDL structure, DFT
and DCT orthogonal transform structure, lattice structure and Gram-Schmidt orthogonal
structure GSC form linearly constrained MVDR adaptive array.

This section will be explicitly concerned with evaluating the transient behavior of
the adaptive structures under consideration. The ensemble mean-square error for each case
will be estimated and compared. The performance as a function of computational
complexity will be examined and related to the more computationally expensive least
squares techniques, utilizing Sample Matrix Inversion (SMI) [34] as a reference.

The first example to be considered is a continuation of the simulation from the
second chapter. A second example is then generated, where the array to be considered for
the simulation is composed of ten linear sensors (K=10) equispaced at half-wavelength. The
sampled signals are delayed via an FIR filter of order eight (J=8). The tap spacing in both
examples defines a frequency of $f_0$. There is one desired signal, in whose direction
the array is assumed to be pre-steered. The look direction sensor noise is omitted from both
of these examples.
The first example examines the transient behavior of the array via simulation for the
TDL, DFT, DCT, lattice and Gram-Schmidt structures GSC Ws2 form constrained adaptive
array. The propagating signal descriptions remain unchanged from table 1 in the example
of Chapter II. The behavior of the structures is characterized by the estimated mean-square
error and estimated mean transform domain weight transients. The mean-square error
estimate for each structure was formed by averaging the output power of each processor
over two hundred independent simulations (consisting of three hundred adaptations each),
and the mean weight vector values were similarly averaged. Again, the same observation
data was provided to each adaptive filter structure during the independent simulations for
consistency.

The graphs in figures 47 and 48 present the ensemble weight vector trajectories and
learning curve for the TDL structure. The graphs presented in figures 49 - 50, 51 - 52, 53
- 54 and 55 - 56 depict the analogous results for the DFT, DCT, lattice and Gram-Schmidt
structures. The benefit of using an orthogonal transform is readily apparent from the better
mean-square error performance of all such structures compared to the TDL in figure 48.
The behavior of the time-varying orthogonal lattice and Gram-Schmidt structures appears
nearly identical, as was expected from their derivations. The DCT frequency domain structure
performs nearly as well as the time-varying orthogonal structures while the DFT structure's
performance is only slightly better than that of the TDL. To better depict the situation, figures
57, 58, 59 and 60 present the TDL, DFT, DCT and lattice structures' performance (dotted
lines) versus that of the Gram-Schmidt (solid line), respectively.


The normalized time-varying step sizes for each structure were initialized to a value
inversely proportional to $Z^T(0)Z(0)$ for both the TDL and frequency domain structures, where
Z is the transform domain data vector (the transform is Q=I for the TDL), and to a value
inversely proportional to $Z^2(0)$ for both the lattice and Gram-Schmidt structures. While this
conveniently removes the necessity of choosing an initial step size / power estimate, it does
result in the minor MSE overshoot present in the figures.

The mean-square error performance of the second example is now considered, where
the array consists of ten sensors and eight taps per sensor. The signal characteristics for the
ten sensor array are described in table 2. It is noted that jammer #2 is now centered at the
same frequency as the desired signal and that it has a larger bandwidth.


Table 2  Signal Characteristics

SOURCE            θ           POWER     CENTER FREQUENCY     BANDWIDTH
desired signal    0°          0.001     0.3 f0               0.1
jammer #1         -12.56°     1.0       0.4 f0               0.05
jammer #2         -16.56°     5.0       0.3 f0               0.15
jammer #3         25.58°      10.0      0.2 f0               0.07

The mean-square error performance of the TDL, DFT, DCT, lattice and
Gram-Schmidt structures is depicted in figures 61, 62, 63, 64 and 65, respectively. These
results were generated by averaging the mean-square error of two hundred independent
simulations consisting of five hundred adaptations each. The performance of the TDL, DFT,
DCT and lattice structures (dotted line) versus that of the Gram-Schmidt (solid line) is
presented in figures 66, 67, 68 and 69.
Figure 69 Lattice vs G-S MSE

The results are very similar to those of the last example. The performance of the
DFT is not much better than that exhibited by the TDL structure. The DCT structure's
performance is considerably better than that of the TDL. The performance of the lattice and
Gram-Schmidt structures is nearly identical. It is noted that the DCT structure achieves the
same mean-square error performance as the Gram-Schmidt after approximately two hundred
adaptations, while both the DFT and the TDL structures do not attain that level throughout
the five hundred adaptation period.

These results may be viewed as presenting a graphical representation of the
capability each structure has to provide an uncorrelated signal set to the processor. The DCT
and DFT both incorporate data-independent transforms and therefore are not able to change
in time with the input process. Thus, given that the LMS step sizes are computed
equivalently across the different structures, the performance increase of these transform
domain structures over the TDL is limited by the capability of the fixed transform to produce
a diagonal correlation matrix. The Gram-Schmidt structure is data dependent, and
continuously attempts to provide an orthogonal output based on the input process. The
lattice structure maps the input data to an orthogonal basis through independent stages, as
described in section III.C. Therefore, after the convergence of the PARCOR coefficients,
the lattice structure also produces a completely orthogonal output.
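The limitation of a fixed transform can be illustrated by comparing the eigenvalue spread of the power-normalized correlation matrix before and after a transform. The sketch below is a scalar illustration under assumed conditions: a hypothetical first-order autoregressive correlation matrix with coefficient `rho`, a hand-built orthonormal DCT-II matrix, and a data dependent lower triangular transform obtained from a Cholesky factor as a stand-in for a converged Gram-Schmidt or lattice transform. It does not reproduce the array scenarios of the examples above.

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT-II matrix of size N x N."""
    n = np.arange(N)
    T = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
    T[0, :] /= np.sqrt(2.0)
    return T

def normalized_spread(R):
    """Eigenvalue spread after normalizing each component to unit power."""
    d = 1.0 / np.sqrt(np.diag(R))
    eig = np.linalg.eigvalsh(R * np.outer(d, d))   # ascending eigenvalues
    return eig[-1] / eig[0]

N, rho = 8, 0.9                                    # hypothetical values
idx = np.arange(N)
R = rho ** np.abs(idx[:, None] - idx[None, :])     # AR(1) correlation matrix
T = dct_matrix(N)
Q = np.linalg.inv(np.linalg.cholesky(R))           # lower triangular, Q R Q^T = I

spread_tdl = normalized_spread(R)                  # no transform (Q = I)
spread_dct = normalized_spread(T @ R @ T.T)        # fixed transform
spread_gs = normalized_spread(Q @ R @ Q.T)         # data dependent transform
print(spread_tdl, spread_dct, spread_gs)
```

The fixed DCT reduces the spread but does not remove it, while the data dependent triangular transform diagonalizes the correlation matrix and brings the normalized spread to one.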

The  computational  requirements  of  the  different  structures  are  now  compared.  The 
measure  of  computational  complexity  used  will  be  the  number  of  adaptive  coefficients 
required  for  the  realization  of  each  structure.  Since  the  LMS  algorithm  is  being  used  for  all 
structures,  the  number  of  required  operations  (multiplications  and  additions)  for  each 
coefficient  will  be  the  same,  except  for  the  DFT,  where  the  operations  are  complex.  Thus, 
this  measure  is  reasonable  and  provides  a  comparable  quantity. 


The required number of adaptive coefficients for the structures is presented in table
3. For the array used in the second example, K=10 and J=8 so that the TDL, DFT and DCT
structures required 72 coefficients, the lattice required 1,206 and the Gram-Schmidt structure
required 2,628 adaptive coefficients.

Table 3  Computational Requirements in Terms of Adaptive Coefficients

STRUCTURE            ADAPTIVE COEFFICIENTS
TDL                  (K-1)J
Frequency Domain     (K-1)J
Lattice              (K-1)J + 2(K-1)^2 (J-1)
Gram-Schmidt         (K-1)J + [(K-1)J][(K-1)J-1]/2
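The coefficient counts quoted for the second example follow directly from the expressions in table 3. A short sketch of the arithmetic is given below; the function name is an assumption made for illustration.

```python
def adaptive_coefficient_counts(K, J):
    """Number of adaptive coefficients per structure, per table 3."""
    n = (K - 1) * J                      # estimation weights common to all structures
    return {
        "TDL": n,
        "Frequency Domain": n,
        "Lattice": n + 2 * (K - 1) ** 2 * (J - 1),
        "Gram-Schmidt": n + n * (n - 1) // 2,
    }

counts = adaptive_coefficient_counts(K=10, J=8)
print(counts)
# {'TDL': 72, 'Frequency Domain': 72, 'Lattice': 1206, 'Gram-Schmidt': 2628}
```

The lattice term 2(K-1)^2 (J-1) counts the forward and backward (K-1) x (K-1) reflection coefficient matrices of the J-1 stages, and the Gram-Schmidt term counts the strictly lower triangular elements of a (K-1)J x (K-1)J matrix.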


Sample  matrix  inversion  is  a  weight  determination  approach,  and  Gram-Schmidt  an 
algorithm  for  solving  the  SMI  or  least  squares  problem  [55].  Thus,  the  Gram-Schmidt 
structure  provides  a  means  of  realizing  the  SMI  algorithm.  Gerlach  [56]  and  Youn  [55] 
recently commented on the convergence behavior of these two algorithms, and both agree
that  they  are  numerically  equivalent  assuming  infinite  numerical  accuracy.  Therefore,  a 
direct  comparison  can  be  made  of  the  computational  requirements  (in  terms  of  the  number 
of  needed  adaptive  coefficients)  with  respect  to  the  least  squares  algorithms,  where  the  LMS 
version  of  the  Gram-Schmidt  structure  presented  here  is  the  lower  bound  in  terms  of  required 
operations. 


IV.  CONCLUSIONS


The purpose of this research was to investigate methods of improving the transient
response of constrained adaptive array sensor processors while simultaneously satisfying a
requirement for limited computational resources. The adaptive processor using the LMS
algorithm provides the smallest computational requirements. This processor was developed
for the standard TDL structure in chapter I and investigated in terms of the constrained array
sensor problem in chapter II. It was shown that the tradeoff for computational simplicity
was a transient behavior dependency upon the eigenstructure of the correlation matrix which
described the signal and array geometry. This same problem motivated the development of
more expensive least squares techniques which led to a solution exhibiting independence of
the correlation matrix eigenstructure. Thus, the course of action undertaken in this research
was to investigate methods of improving the convergence properties of the LMS array
processor in order to gain performance similar to that of the least squares algorithms while
maintaining computational simplicity.

The development and utilization of the GSC form linearly constrained MVDR
adaptive array sensor in chapter II allowed the adaptive processor to be realized in an
unconstrained manner while the overall solution satisfied the constraint. This feature of the
GSC provided the motivation to replace the standard TDL structure processor with an
orthogonal filter structure. In chapter III, the DFT and DCT frequency domain structures,
the lattice structure and a direct implementation of the Gram-Schmidt structure were
investigated. The results of the simulations presented in chapter III clearly depict the
advantages of orthogonal structures for the linearly constrained MVDR processor.
The fixed transform frequency domain structures have been shown to provide an
improvement in the transient behavior of the adaptive array at no increase in the (K-1)J
adaptive coefficient computational requirements. However, these structures may still be
dependent upon the eigenvalue spread of the correlation matrix. Furthermore, there is no a
priori method of knowing which fixed transform will provide the best results for any given
observation process. In general, the use of the frequency domain structures will provide
some benefit in the array's dynamic behavior as long as subband normalization is used, and
the benefit may be great depending upon the effectiveness of the transform and the
interference dynamic range.

The DCT structure processor provided an effective orthogonal transform which, with
the use of subband normalization, led to a transient behavior which was extremely close in
performance to that of the lattice and the Gram-Schmidt structures. Furthermore, as depicted
in table 3, the structure's real-valued DCT required only one matrix multiplication beyond
the computations of the TDL and no additional adaptive coefficients.

The transient performance of the lattice structure is greatly improved over that of the
TDL and frequency domain structures. If the increased computational requirements of
(K-1)J + 2(K-1)^2 (J-1) adaptive coefficients are acceptable, then the lattice structure is the
processor of choice. It is noted that this computational complexity is less than that required
for least squares methods. The convergence time of the PARCOR coefficients in the lattice
structure is dependent upon the eigenstructure of independent backward prediction error
correlation matrices of smaller order than the original correlation matrix. Once these
coefficients converge, a completely orthogonal process serves as input to the standard LMS
adaptive processor. Since the PARCOR coefficients are similarly updated via the LMS
algorithm, the computational increase could be directly compared in table 3 of the last chapter.

The Gram-Schmidt structure's performance was seen to be the best. It provides an
orthogonal output to the adaptive filter via a direct orthogonalization process. The
Gram-Schmidt structure, however, suffers from a large computational burden in adapting
the [(K-1)J][(K-1)J-1]/2 LMS coefficients required for orthogonalization.

In those applications where the DCT performs as well as in the examples of Chapter
III, the DCT frequency domain structure should be the processor of choice for performance
versus complexity. The lattice structure provides the best overall performance for cost,
providing a nearly equivalent behavior to the Gram-Schmidt structure. Either of these latter
two structures will provide a transient performance that is numerically equivalent with the
least-squares techniques.

Areas for further research include studying the behavior of these structures when the
input process is non-stationary, investigating the utility of an adaptive GSC blocking matrix
in order to serve as a pre-processor and provide a more uncorrelated input to the adaptive
filter structure (especially for the case of the structure employing a fixed frequency domain
transformation), analysis of other orthogonal transforms in the frequency domain GSC
structure, and sensitivity analysis via derivative or soft constraints to the LMS algorithm.
Furthermore, the analysis of the Short-Time Fourier / Cosine Transform and the application
of wavelet transforms to the frequency domain structure are recommended in order to
provide a capability to track the input process statistics in a manner similar to that achieved
by the lattice and Gram-Schmidt structures.
APPENDIX I


The basic reference for this appendix is Lancaster [18]. Throughout this section,
$\mathcal{F}$ will denote a general field and R will denote the set of real numbers.

Definition A1-1: A subspace

Let $\mathcal{S}$ denote a linear space over a field $\mathcal{F}$ and consider a nonempty subset $\mathcal{S}_0$ of the elements
from $\mathcal{S}$. The operations of addition and scalar multiplication are defined for all elements of
$\mathcal{S}$ and, in particular, for all elements belonging to $\mathcal{S}_0$. If these operations are closed in $\mathcal{S}_0$,
so that for scalars $\alpha$, $\gamma$ and vectors A, B: $A \in \mathcal{S}_0$, $B \in \mathcal{S}_0$

$$ \alpha A + \gamma B \in \mathcal{S}_0 $$

then we say that $\mathcal{S}_0$ is a subspace of $\mathcal{S}$.

If $\mathcal{S}$ is a linear space, it is readily verified that if 0 is the zero element of $\mathcal{S}$ then the
singleton {0} and the whole space $\mathcal{S}$ are subspaces of $\mathcal{S}$. These are the trivial subspaces. It
is important to note that the zero element of $\mathcal{S}$ is necessarily the zero element of any subspace
$\mathcal{S}_0$ of $\mathcal{S}$. This can be shown to be true by considering $\alpha A = 0$ where $\alpha = 0$ is scalar and
$A \in \mathcal{S}_0$.

Definition A1-2: A null space or kernel

The set of all vectors W such that $A^T W = 0$ is the nullspace or kernel of the matrix A and
is written Ker A.

Proposition A1-1: Let $A^T \in \mathcal{F}^{m \times n}$. Then the set of all solutions of the homogeneous
equation $A^T W = 0$ forms a subspace of $\mathcal{F}^n$ which, by definition A1-2, is $\mathcal{S}_0$ = ker A.

Proof of Proposition A1-1: Let Y, Z: $Y \in \mathcal{S}_0$, $Z \in \mathcal{S}_0$. Then the vectors must
satisfy $A^T Y = 0$ and $A^T Z = 0$. For any scalars $\alpha$ and $\gamma$ the vector $A^T(\alpha Y + \gamma Z) = 0$.
Hence, $\mathcal{S}_0$ = ker A is a subspace.

Definition A1-3: A range space or image

A dual concept to that of the null space is the range space or image of a matrix A, denoted
Im A. Let $A^T \in \mathcal{F}^{m \times n}$, then

$$ \mathrm{Im}\, A = \left\{\, Y \in \mathcal{F}^m : Y = A^T W \text{ for some } W \in \mathcal{F}^n \,\right\} $$

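Definition A1-4: A span of a subspace

Any subspace containing the elements $a_1, \ldots, a_n$ must also contain all elements of the form
$\sum_{i=1}^{n} \alpha_i a_i$ for any $\alpha_i \in \mathcal{F}$. This implies that the set of all
linear combinations over $\mathcal{F}$ of the elements $\{a_i\}_{i=1}^{n}$ belonging to a linear
space $\mathcal{S}$ generates a subspace $\mathcal{S}_0$ of $\mathcal{S}$.

It can be seen that the subspace above is the minimal subspace of $\mathcal{S}$ containing
$\{a_i\}_{i=1}^{n}$, in the sense that $\mathcal{S}_0 \subset \mathcal{S}_1$ for any subspace
$\mathcal{S}_1$ which also contains $\{a_i\}_{i=1}^{n}$. This minimal subspace is called the linear
hull or span of $\{a_i\}_{i=1}^{n}$ over $\mathcal{F}$. Thus,

$$\operatorname{span}\{a_1, \ldots, a_n\} = \Big\{\, a \in \mathcal{S} : a = \sum_{i=1}^{n} \alpha_i a_i,\ \alpha_i \in \mathcal{F} \,\Big\}$$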
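
As a small worked illustration of the span, consider the real space $\mathbb{R}^3$ with two
vectors chosen only for this example (they do not appear in the report): $a_1 = (1, 0, 0)^T$ and
$a_2 = (0, 1, 0)^T$.

```latex
\operatorname{span}\{a_1, a_2\}
  = \{\, \alpha_1 a_1 + \alpha_2 a_2 : \alpha_1, \alpha_2 \in \mathbb{R} \,\}
  = \{\, (\alpha_1, \alpha_2, 0)^T : \alpha_1, \alpha_2 \in \mathbb{R} \,\}
```

This is the plane of vectors whose third component is zero. Any subspace of $\mathbb{R}^3$ that
contains $a_1$ and $a_2$ must contain this plane, which is the minimality property stated in the
definition.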

Definition A1-5: Idempotent Matrix

The matrix $A$ is said to be idempotent if

$$A^2 = A,$$

which implies that for any positive integer $i$ and idempotent matrix $A$, $A^i = A$.

Proposition A1-2: If $P$ is an idempotent matrix, then:

1. $I - P$ is idempotent.

2. $\operatorname{Im}(I - P) = \ker P$

3. $\ker(I - P) = \operatorname{Im} P$

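Proof of Proposition A1-2:

1. $(I - P)^2 = I - 2P + P^2 = I - P$

2. If $y \in \operatorname{Im}(I - P)$, then $y = (I - P)x$ for some $x \in \mathcal{F}^n$.
Therefore,

$$Py = P(I - P)x = (P - P^2)x = 0$$

and $y \in \ker P$. Conversely, if $Py = 0$, then $(I - P)y = y$ and $y \in \operatorname{Im}(I - P)$,
so that $\operatorname{Im}(I - P) = \ker P$.

3. Similar to the above argument, let $y \in \ker(I - P)$. Then $Py = y$ since
$(I - P)y = y - Py = 0$, and $y \in \operatorname{Im} P$. Conversely, if $y = Px$ for some $x$, then
$(I - P)y = (P - P^2)x = 0$, so that $\ker(I - P) = \operatorname{Im} P$.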
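
The three statements of Proposition A1-2 can be illustrated on a small case. The sketch below
assumes a hypothetical 2 x 2 idempotent matrix P (an oblique projector chosen for this example,
not taken from the report) and hypothetical helper names `matmul` and `sub`.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def sub(A, B):
    """Elementwise difference A - B."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


I2 = [[1, 0], [0, 1]]
P = [[1, 1], [0, 0]]          # assumed example: P*P == P, but P is not symmetric
Q = sub(I2, P)                # Q = I - P

is_P_idempotent = matmul(P, P) == P
is_Q_idempotent = matmul(Q, Q) == Q      # statement 1

# P(I - P) = 0: every vector in Im(I - P) lies in ker P (one inclusion of statement 2).
PQ = matmul(P, Q)
# (I - P)P = 0: every vector in Im P lies in ker(I - P) (one inclusion of statement 3).
QP = matmul(Q, P)

print(is_P_idempotent, is_Q_idempotent, PQ, QP)
```

The products PQ and QP only demonstrate one inclusion each; the reverse inclusions follow from
the "conversely" steps in the proof.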

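Proposition A1-3: If $P$ is idempotent, then $\ker P + \operatorname{Im} P = \mathcal{F}^n$.

Proof of Proposition A1-3: For any $x \in \mathcal{F}^n$ we can write $x = x_1 + x_2$, where
$x_1 = (I - P)x$ and $x_2 = Px$. Note that $x_1 \in \ker P$ while $x_2 \in \operatorname{Im} P$.
Hence, the whole space is given by $\mathcal{F}^n = \ker P + \operatorname{Im} P$. Furthermore, if
$x \in \ker P \cap \operatorname{Im} P$, it must be the zero element, since $x = Py$ for some $y$
gives $x = Py = P^2 y = Px = 0$.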

Definition A1-6: A projection matrix

An idempotent matrix is a projection matrix. Each idempotent matrix $P$ generates two
unique mutually complementary subspaces $\mathcal{S}_1 = \ker P$ and $\mathcal{S}_2 = \operatorname{Im} P$,
and their sum is the entire space. Thus $P$ performs the projection of the space $\mathcal{F}^n$ on
the subspace $\mathcal{S}_2$ parallel to $\mathcal{S}_1$, or onto $\mathcal{S}_2$ along $\mathcal{S}_1$.
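
The decomposition of a vector into its components in ker P and Im P can be sketched as follows.
The matrix P and the vector x are assumptions made for this example only, and `mat_vec` is a
hypothetical helper.

```python
def mat_vec(M, v):
    """Multiply matrix M (a list of rows) by the column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


P = [[1, 1], [0, 0]]     # assumed idempotent matrix (P*P == P)
x = [3, 4]               # arbitrary vector to decompose

x2 = mat_vec(P, x)                     # component in Im P:  x2 = P x
x1 = [a - b for a, b in zip(x, x2)]    # component in ker P: x1 = (I - P) x

# x1 is annihilated by P, x2 is left unchanged by P, and they sum to x.
assert mat_vec(P, x1) == [0, 0]
assert mat_vec(P, x2) == x2
assert [a + b for a, b in zip(x1, x2)] == x

print(x1, x2)
```

For this P, Im P is the first coordinate axis and ker P is the line spanned by (1, -1), so the
projection is oblique: x is moved onto the axis along the direction (1, -1) rather than
perpendicular to the axis.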

APPENDIX II


This appendix presents the proof that the GSC processor with any form of a spatial
matrix filter $W_s$ satisfying

$$W_s C = 0 \tag{A2-1}$$

$$\operatorname{rank}(W_s) = (K - 1)J \tag{A2-2}$$

where the constraint matrix $C$ is defined in equation (2-5), will yield an optimal processor
which is equivalent to both the partitioned form and the direct form CLMS processors under
the same constraint. This derivation follows directly from Jim [54].

The problem may be stated as follows:

Given the $KJ \times J$ dimensional constraint matrix $C$; the relationship $W_s C = 0$, resulting
from the fact that look direction signals are eliminated from the GSC lower path; the
$KJ \times KJ$ non-singular matrix $[W_s^T \ \ C]$, which spans the entire signal space since the
rank of $C$ is $J$ and the rank of $W_s$ is $(K-1)J$; and the $KJ \times KJ$ non-singular symmetric
matrix $R_{xx}$, then, from equations (2-63) and (2-78),

$$I - W_s^T (W_s R_{xx} W_s^T)^{-1} W_s R_{xx} = R_{xx}^{-1} C (C^T R_{xx}^{-1} C)^{-1} C^T \tag{A2-3}$$
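
Before following the proof, equation (A2-3) can be spot-checked with exact rational arithmetic.
The sketch below assumes a hypothetical case with K = 3 and J = 1, a constraint matrix C of all
ones, a differencing matrix for W_s (so that W_s C = 0 and rank(W_s) = 2), and an arbitrary
symmetric positive definite R_xx. These values and the helper functions are illustrative
assumptions, not quantities taken from the report.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def inverse(A):
    """Gauss-Jordan inverse in exact arithmetic (A is assumed non-singular)."""
    n = len(A)
    M = [[Fraction(v) for v in row] + identity(n)[i] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


# Assumed example data (K = 3, J = 1).
C = [[1], [1], [1]]                      # KJ x J constraint matrix
Ws = [[1, -1, 0], [0, 1, -1]]            # (K-1)J x KJ, satisfies Ws C = 0
Rxx = [[4, 1, 0], [1, 3, 1], [0, 1, 2]]  # symmetric positive definite

assert matmul(Ws, C) == [[0], [0]]       # condition (A2-1)

# Left side of (A2-3): I - Ws^T (Ws Rxx Ws^T)^-1 Ws Rxx
WsR = matmul(Ws, Rxx)
lhs = sub(identity(3),
          matmul(matmul(transpose(Ws), inverse(matmul(WsR, transpose(Ws)))), WsR))

# Right side of (A2-3): Rxx^-1 C (C^T Rxx^-1 C)^-1 C^T
RinvC = matmul(inverse(Rxx), C)
rhs = matmul(matmul(RinvC, inverse(matmul(transpose(C), RinvC))), transpose(C))

print(lhs == rhs)
```

Agreement on one example does not prove the identity; it only gives a concrete instance of the
matrices involved and a quick way to catch transcription errors in (A2-3).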


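The proof is now developed through the existence of an orthogonal non-singular
transformation matrix $T$ such that

$$TC = \begin{bmatrix} 0 \\ V \end{bmatrix} \tag{A2-4}$$

Then,

$$(W_s T^T)(TC) = 0, \qquad W_s T^T = \begin{bmatrix} U & 0 \end{bmatrix} \tag{A2-5}$$

Because $C$ has rank $J$ and $W_s$ has rank $(K-1)J$, the blocks $V$ ($J \times J$) and
$U$ ($(K-1)J \times (K-1)J$) are non-singular.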


We now let

$$T R_{xx} T^T = \begin{bmatrix} A & B \\ B^T & D \end{bmatrix} \tag{A2-6}$$

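and consider an operation

$$(T^T)^{-1} \left[ I - W_s^T (W_s R_{xx} W_s^T)^{-1} W_s R_{xx} = R_{xx}^{-1} C (C^T R_{xx}^{-1} C)^{-1} C^T \right] (T^T) \tag{A2-7}$$

this implies

$$I - (W_s T^T)^T \left[ (W_s T^T)(T R_{xx} T^T)(W_s T^T)^T \right]^{-1} (W_s T^T)(T R_{xx} T^T) = (T R_{xx} T^T)^{-1} (TC) \left[ (TC)^T (T R_{xx} T^T)^{-1} (TC) \right]^{-1} (C^T T^T) \tag{A2-8}$$

which, on substituting the partitioned forms of $TC$, $W_s T^T$ and $T R_{xx} T^T$, may be written as

$$I - \begin{bmatrix} U^T \\ 0 \end{bmatrix} [U A U^T]^{-1} \begin{bmatrix} UA & UB \end{bmatrix} = \begin{bmatrix} A & B \\ B^T & D \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ V \end{bmatrix} \left( \begin{bmatrix} 0 & V^T \end{bmatrix} \begin{bmatrix} A & B \\ B^T & D \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ V \end{bmatrix} \right)^{-1} \begin{bmatrix} 0 & V^T \end{bmatrix} \tag{A2-9}$$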

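To show that equation (A2-9) is valid, let $(T R_{xx} T^T)^{-1}$ be given by the block partition
form matrix

$$\begin{bmatrix} A & B \\ B^T & D \end{bmatrix}^{-1} = \begin{bmatrix} X & Y \\ Y^T & Z \end{bmatrix} \tag{A2-10}$$

Then

$$X = (A - B D^{-1} B^T)^{-1}$$

$$Y = -A^{-1} B Z$$

$$Z = D^{-1} + D^{-1} B^T (A - B D^{-1} B^T)^{-1} B D^{-1} = (D - B^T A^{-1} B)^{-1} \tag{A2-11}$$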
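
The partitioned-inverse relations in (A2-11) can be sanity-checked in the simplest possible
setting, where each block is a 1 x 1 matrix. The sketch below assumes hypothetical scalar values
A = 2, B = 1, D = 3 (so that the partitioned matrix is symmetric and non-singular); these numbers
are not from the report. With scalar blocks every product commutes, so this checks the algebra of
the formulas but not the ordering of matrix factors.

```python
from fractions import Fraction

# Assumed scalar blocks of the symmetric matrix [[A, B], [B, D]].
A, B, D = Fraction(2), Fraction(1), Fraction(3)

# Relations of (A2-11), with B^T = B in the scalar case.
X = 1 / (A - B * (1 / D) * B)
Z = 1 / (D - B * (1 / A) * B)
Y = -(1 / A) * B * Z

# The alternative (expanded) expression for Z given in (A2-11).
Z_alt = 1 / D + (1 / D) * B * X * B * (1 / D)

# Multiply the original matrix by the claimed inverse [[X, Y], [Y, Z]].
M = [[A, B], [B, D]]
M_inv = [[X, Y], [Y, Z]]
product = [[sum(M[i][k] * M_inv[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]

print(X, Y, Z, Z_alt, product)
```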


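Furthermore,

$$\begin{bmatrix} YV \\ ZV \end{bmatrix} [V^T Z V]^{-1} \begin{bmatrix} 0 & V^T \end{bmatrix} = \begin{bmatrix} 0 & Y Z^{-1} \\ 0 & I \end{bmatrix} = \begin{bmatrix} 0 & -A^{-1} B \\ 0 & I \end{bmatrix} \tag{A2-12}$$

and

$$\begin{bmatrix} U^T \\ 0 \end{bmatrix} [U A U^T]^{-1} \begin{bmatrix} UA & UB \end{bmatrix} = \begin{bmatrix} I & A^{-1} B \\ 0 & 0 \end{bmatrix}$$

Subtracting the last matrix from the identity gives the same matrix as in (A2-12), so the two
sides of (A2-9) agree.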

Therefore, equation (A2-3) holds and the proof is complete. Jim [54] also notes that there
are no further restrictions on possible forms of the matrix filter $W_s$ other than satisfying
equations (A2-1) and (A2-2). If $W_s$ is valid, then so is any non-singular transformation of
$W_s$.


MISSION

OF

ROME LABORATORY

Rome Laboratory plans and executes an interdisciplinary program in research,
development, test, and technology transition in support of Air
Force Command, Control, Communications and Intelligence (C3I) activities
for all Air Force platforms. It also executes selected acquisition programs
in several areas of expertise. Technical and engineering support within
areas of competence is provided to ESD Program Offices (POs) and other
ESD elements to perform effective acquisition of C3I systems. In addition,
Rome Laboratory's technology supports other AFSC Product Divisions, the
Air Force user community, and other DOD and non-DOD agencies. Rome
Laboratory maintains technical competence and research programs in areas
including, but not limited to, communications, command and control, battle
management, intelligence information processing, computational sciences
and software producibility, wide area surveillance/sensors, signal processing,
solid state sciences, photonics, electromagnetic technology, superconductivity,
and electronic reliability/maintainability and testability.


